Abstract


The Internet provides much information about learning opportunities,
including courses at universities, conferences, and any other types of
training programs. Often, this information is cluttered, which increases the
need for organizing and accessing the information in a way that is useful,
structured, and comprehensive. The Metadata for Learning Opportunities
Advertising (MLO-AD) standard specifies the data model for learning
opportunities. It defines which information an educational event must or can
provide to make it comparable with other programs. This thesis describes a
concept for an aggregator portal which collects data based on the MLO-AD
standard from multiple databases and consolidates it. Moreover, the portal
allows information about learning opportunities to be accessed in a
comprehensive and structured way, and keeps the data up-to-date. The emphasis
of this concept is placed on different solutions of data collection that use
the complex structure of MLO-AD. Besides the analysis for collecting the
required information and the exploration of the design for such a portal,
this thesis results in an implementation of a prototype. This prototype is
the first MLO-AD-based web application which allows prospective learners to
search for an educational event and providers of learning opportunities to
advertise their offers.



Kurzfassung 


The Internet offers a wealth of information about learning opportunities.
This term covers university courses, training seminars and conferences, as
well as all other kinds of educational events. Often, however, this
information is scattered, which increases the need for useful, structured,
and comprehensive access to it. The Metadata for Learning Opportunities
Advertising (MLO-AD) standard defines the data model for learning
opportunities. MLO-AD thus determines which information an educational event
can or must provide so that such events are comparable with one another.
This thesis describes the concept for an aggregator portal that collects data
based on the MLO-AD standard from multiple databases. In addition, the portal
provides access to structured and comprehensive information about learning
opportunities and updates it regularly to keep it current. The emphasis of
this thesis lies on the analysis of different options for data collection,
where the data use the complex structure of MLO-AD. Besides this analysis and
the exploration of the design of such a system, this work also comprises the
implementation of a prototype. This prototype is the first web application
based on MLO-AD that enables prospective learners to search for educational
events and providers of learning opportunities to advertise their offers.



Chapter 1 

Introduction 


Learning is not a product of schooling, 

but the lifelong attempt to acquire it. 

Albert Einstein (1879-1955) 

Lifelong learning is important for many people, both because they wish to
enrich their knowledge for private or professional reasons and because they
need to remain employable and competitive. With the increasing demand for
educational offers and the progressive development of technologies that can
be used for education, the supply of learning opportunities has been rising.

In this thesis, the term learning opportunity is used as a reference
to all events that provide a context of learning. This includes, inter
alia, courses at schools and universities, forums, conferences, as
well as all kinds of training programs.

Because of the increasing number of learning opportunities, the rising
availability of borderless education (anytime, anywhere), as well as the
trend towards more mobility in education [23, p. 142], learners have more
possibilities to choose their courses. However, this wide range of
opportunities also makes it more difficult to get an overview of all offers.

To get information about learning opportunities, many learners use
educational portals, either directly from a specific educational organization
or from a provider which collects data from several institutions. Systems
that gather data from multiple sites and consolidate it onto one page are
called aggregators.

Aggregation is the process of creating a compound object from 
several smaller ones [19, p. 4]. 


Usually, the decision to attend a course or any other educational program
is preceded by retrieving at least the main information about this
educational event, such as subject, time, location, costs, qualification,
and level or quality of the course. To ensure the provision of these
properties and, more importantly, to be able to compare learning
opportunities, a common structure for the data of educational offers must be
defined. Uniform models of data or any technology are guaranteed by standards
or specifications.

In December 2007, the Metadata for Learning Opportunity (MLO) project
group started to develop the MLO-Advertising (MLO-AD) standard that
addresses metadata sufficient for advertising a learning opportunity
[39, p. 2]. MLO-AD provides learners with information about an educational
event and helps them to compare different offers. Since October 2008, MLO-AD
has been in development as a formal European standard. A group in Canada is
also contributing to the development of MLO-AD. The working group
GTN-Québec 1 supplies the educational community with expertise in the area
of e-Learning object standards and aims to propose an application profile
based on MLO-AD. Different interest groups contribute to the targets of
GTN-Québec. One of them is the Computer Research Center of Montréal
(CRIM) 2. CRIM is an information technologies-applied research center that
develops and transfers technologies and knowledge. Additionally, it provides
a training center for information technology. As CRIM and GTN-Québec are
generally interested in the developments of MLO-AD, these organizations
made the project of this thesis possible and have strongly supported the
creation of this work.

This thesis describes the concept of an aggregator portal that provides
information about learning opportunities, which are structured by the data
model of MLO-AD. The emphasis of the analysis is placed on possibilities for
collecting information from different learning opportunity providers, which
include all institutions and instructors offering educational events. The
challenge of this research lies in the complex structure of MLO-AD, which
makes the collection and transfer of such data more difficult. Besides the
concept of the portal, this work includes the implementation of a prototype.
This prototype is the first aggregator portal based on the MLO-AD standard
that provides a search functionality and the possibility to advertise
learning opportunities.

1 http://www.gtn-quebec.org

2 http://www.crim.ca

The structure of this work is as follows: Chapter 2 gives a review of prior
standards for describing learning opportunities, the MLO-AD standard itself,
existing portals that offer search functionality for educational events, and
existing aggregator systems. Chapter 3 describes the target audience and
the requirements of the aggregator portal based on MLO-AD, as well as the
criteria for the prototype. Different technologies for collecting data about
learning opportunities are analyzed in Chapter 4, with the emphasis placed
on solutions for the complex structure of MLO-AD. Chapter 5 addresses
the user interface and the possibilities for interaction with the aggregator
portal. The implementation of the web application, which collects data from
learning opportunity providers, is described in Chapter 6 and reflected on in
Chapter 7. The latter also evaluates the analyses of this thesis. Appendix A
describes and lists the data saved on the enclosed CD-ROM.



Chapter 2 

State of the Art 


MLO-AD is not the first standard that offers a common structure to describe
and compare learning opportunities. Different specifications across Europe
and in the USA have already been in use for several years. Some of them also
form the basis of web applications that allow users to search for information
about learning opportunities.

This chapter evaluates existing standards for describing learning
opportunities to gain a better insight into the development of MLO-AD, as
well as into the connections between those standards and their influences on
each other. The history, the structure, and the special features of MLO-AD
are likewise addressed. A later section takes a closer look at some available
web portals that offer search functionality for educational events. To give
an insight into how aggregators work, representative data collectors are
additionally described in this chapter.


2.1 Existing Standards 

Universities or other educational institutions have used various data formats,
data models or terminology to describe their offers in an electronic way. This
makes a comparison of learning opportunities and interoperability between
portals of these institutions very difficult or impossible. Non-interoperability
of data is usually caused by the lack of widely accepted standards or
specifications.

Across Europe, a trend to develop comparable descriptions of educational
offers arose after the Bologna Process, which was initiated in June 1999.
Today, 46 countries are committed to achieving the goals of this process and
to creating the European Higher Education Area (EHEA) 1. The Bologna
Declaration of June 19, 1999 says [13, p. 7]:

The achievement of greater compatibility and comparability of
the systems of higher education nevertheless requires continual
momentum in order to be fully accomplished. We need to support
it through promoting concrete measures to achieve tangible
forward steps.

Therefore, European governments have strongly encouraged projects which
targeted standardization of course descriptions to achieve a better
comparison of educational offers and interoperability between different
educational systems.


2.1.1 CDM and CDM-fr in Norway and France 

The specification Course Description Metadata (CDM) 2 was developed in
2001 by USIT’s XML group 3 at the University of Oslo in order to make
study programs or course units comparable and university portals
interoperable. CDM is a part of the Norwegian e-Standard project, Norway
Opening Universities (NOU) 4, a national initiative for change and innovation
in Norwegian higher education. It is expressed as a W3C XML schema document 5
and has been adopted by all Norwegian universities as a base for Norway’s
national educational portal. CDM was presented in 2005 to the CEN/ISSS
as a "candidate standard" for course description [30, p. 2]. The Information
Society Standardization System (ISSS) is the Information and Communication
Technologies sector of the European Committee for Standardization
(CEN), which is a business facilitator in Europe, and provides a platform for
the development of European standards and other technical specifications 6.

1 http://www.ond.vlaanderen.be/hogeronderwijs/bologna

2 http://cdm.utdanning.no/CDM

3 http://www.usit.uio.no/saus/xml (in Norwegian)

4 http://norgesuniversitetet.no/seksjoner/english

5 http://www.w3.org/XML/Schema

6 http://www.cen.eu/cenorm/businessdomains/businessdomains/isss

According to [20, p. 2], the concept of CDM is defined as follows:

CDM addresses the description of educational course units or
other forms of educational offerings at all levels. It specifies the
structure and semantics of the key concepts used in course
descriptions. The metadata are specified as an XML schema, and
guidelines with examples are given to facilitate the generation
of course descriptions as XML documents. The metadata are
intended to satisfy the following objectives:

• Facilitate description and exchange of information about
educational course units,

• Facilitate standardization of course unit descriptions,

• Facilitate the establishment of national and international
course catalogs,

• Facilitate the establishment of course portals and other
services helping students.

Besides the information about the course content or study program, students
need to know details related to the courses to be able to choose to study at
a certain institution. Therefore, CDM not only lists and describes course
units and their content, but also supplies further information needed by
students [30, pp. 3-4]. The data format is divided into four main parts, as is
apparent from the XML schema of CDM 7:

• Organization unit (orgUnitType): Information about the institution,
like description, kind of institution, student facilities, regulations,
admission.

• Study program (programType): Description of a program comprising
a set of course units. This concept includes, for example, name,
description, qualification, level, prerequisites, teaching form, expenses,
program structure and duration.

• Course unit (courseType): Information on course content, degree,
credits, level, syllabus, admission and prerequisites, teaching place,
language and more.

• Contact person (personType): Information on all the relevant data of
the contact person.
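The four parts can be pictured as plain record types. The following Python sketch is only an illustration under stated assumptions: the docstrings name the schema types listed above, but the selection of fields, the field names, the nesting of programs inside an organization unit, and all sample values are hypothetical and do not reproduce the actual CDM schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """Contact person (personType in the schema)."""
    name: str
    email: str = ""


@dataclass
class Course:
    """Course unit (courseType); only a few illustrative fields."""
    title: str
    credits: float = 0.0
    language: str = ""


@dataclass
class Program:
    """Study program (programType): a set of course units."""
    name: str
    qualification: str = ""
    courses: List[Course] = field(default_factory=list)

    def total_credits(self) -> float:
        # Sum the credits of all course units in the program.
        return sum(course.credits for course in self.courses)


@dataclass
class OrgUnit:
    """Organization unit (orgUnitType); the nesting is an assumption."""
    name: str
    kind: str = ""
    programs: List[Program] = field(default_factory=list)
    contacts: List[Person] = field(default_factory=list)


# Hypothetical sample data, not taken from any real catalog.
program = Program(
    name="Example Bachelor Program",
    qualification="Bachelor",
    courses=[Course("Course A", credits=10.0), Course("Course B", credits=5.0)],
)
university = OrgUnit(
    name="Example University",
    kind="university",
    programs=[program],
    contacts=[Person("Jane Doe", "jane.doe@example.org")],
)
print(program.total_credits())
```

Keeping the contact person as a separate type mirrors the schema's split into four parts: the same person record could then be attached to several units without repeating its details.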

CDM fully supports the European Credit Transfer and Accumulation System
(ECTS) and Diploma Supplement (DS) for comparing the study attainment
of students of higher education across Europe [20]. This means it covers
all specified ECTS items which make study programs easy to understand
and comparable for all students and staff, both local and foreign. CDM is
also compatible with a number of existing standard specifications, such as
IEEE Learning Object Metadata (LOM) for course content description [30, p. 2].

7 http://utdanning.no/schemas/CDM/2.0.4/CDM.xsd

CDM-fr

Within a project coordinated by the French Ministry of Education, French
universities and other higher education institutions modified the CDM
specification and developed CDM-fr 8. The project group started in March 2004
with the goal of adjusting the XML schema of CDM to the needs of universities
in France. CDM-fr keeps the four main parts of Norway’s CDM for its
XML schema to describe courses and study programs. However, CDM-fr has
some new or modified specifications in the main parts of the schema, uses
new vocabulary for France, and is extended with the domain "habilitation"
for authorizations. A mind map on the website of CDM-fr 9 illustrates the
structure of the modified XML schema. CDM-fr is accepted as a standard by
AFNOR 10, the standards association of France.

Many universities all over France use CDM-fr for describing and advertising
their courses and training programs. Information about these projects and
the institutions is collected on the website of CDM-fr 11.


2.1.2 EMIL in Sweden 

The history of the Swedish specification Education Information Markup
Language (EMIL) 12 started in 2001, the same year as CDM. The National
Agency of Higher Education developed this metadata model as an XML
schema for their online service www.studera.nu to describe parts of the
Swedish educational sector. In the beginning of 2003, a steering committee
was constituted to build a national catalog containing information about
all courses and programs within the publicly financed educational sector of
Sweden. The committee consisted of representatives from the National
Agency for Education, the National Labor Market Board (AMS) and the
National Agency for School Improvement (MSU) [27, p. 4]. This catalog
is built on an information service that collects and distributes the course
information files, as well as a common information model to describe the
course information. EMIL is the chosen model for the national catalog. The
goal of EMIL is to provide a metadata model for many different information
producers like universities, colleges, folk high schools, and municipalities.

8 http://cdm-fr.fr (in French)

9 http://cdm-fr.fr/ressources/cartes-heuristiques-cdm-fr/cartes-cdmf

10 http://www.afnor.org/portail.asp?lang=English

11 http://cdm-fr.fr/ressources/ressources/carte-cdmfr-France (in French)

12 http://mjukis.skolutveckling.se (in Swedish)

As described in [27, p. 8], the XML schema of EMIL is based on the following
three concepts:

• EducationProvider: Describes the provider of the education based on
the vCard standard (RFC 2426). This concept contains the school’s
name and address, as well as information about the contact person for
each particular provider.

• EducationInfo: Contains general information about a certain course or
education. The concept includes, e.g., course name and course
description.

• EducationEvent: Describes a certain education event, such as the start
date and application code. This concept refers to one EducationInfo
and one EducationProvider.
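The reference structure of these three concepts, in which each event points to exactly one info record and one provider, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class names follow the concept names above, while the identifier fields, the remaining field names, the `resolve` helper, and the sample values are hypothetical and are not part of the EMIL schema.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class EducationProvider:
    provider_id: str  # hypothetical key used for cross-references
    name: str
    address: str


@dataclass(frozen=True)
class EducationInfo:
    info_id: str  # hypothetical key used for cross-references
    course_name: str
    description: str


@dataclass(frozen=True)
class EducationEvent:
    start_date: str
    application_code: str
    info_id: str      # refers to exactly one EducationInfo
    provider_id: str  # refers to exactly one EducationProvider


def resolve(
    event: EducationEvent,
    infos: Dict[str, EducationInfo],
    providers: Dict[str, EducationProvider],
) -> Tuple[EducationInfo, EducationProvider]:
    """Look up the info record and the provider an event refers to."""
    return infos[event.info_id], providers[event.provider_id]


# Hypothetical sample data.
provider = EducationProvider("p1", "Example School", "Example Street 1")
info = EducationInfo("i1", "Example Course", "A hypothetical course description.")
event = EducationEvent("2010-01-15", "EX-001", info_id="i1", provider_id="p1")

resolved_info, resolved_provider = resolve(
    event, {info.info_id: info}, {provider.provider_id: provider}
)
print(resolved_info.course_name, "at", resolved_provider.name)
```

Because the general course description lives in its own record, several events can refer to the same course without duplicating its text.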

EMIL’s specification includes the definition of an "information hub," which is
an information service that collects and distributes EMIL files and provides
course information to various end-user services [27, pp. 4-5]. One advantage
of an information hub is that it places a cache in the system between the
providers and the end-user services, which improves accessibility and
performance. Further advantages are that the hub provides a common and
neutral arena to which all information providers can contribute, and that it
can cater to the needs of many services. An example of an information service
using EMIL is the SUSA hub, which is run by the National Agency for Education
and used by the end-user service Utbildningsinfo.se.
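The caching role of such a hub can be illustrated with a small sketch. The Python code below is not EMIL's hub specification; it only assumes that each provider can be represented by a function returning a list of course records, and the class name `InformationHub`, its methods, and the sample records are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class InformationHub:
    """Collects records from providers and answers searches from a cache."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], List[Record]]] = {}
        self._cache: Dict[str, List[Record]] = {}

    def register(self, provider_id: str, fetch: Callable[[], List[Record]]) -> None:
        # Remember how to obtain the records of one provider.
        self._sources[provider_id] = fetch

    def collect(self) -> None:
        # Contact every provider once and store the result in the cache.
        for provider_id, fetch in self._sources.items():
            self._cache[provider_id] = fetch()

    def search(self, keyword: str) -> List[Record]:
        # End-user services are served from the cache, not from the providers.
        needle = keyword.lower()
        return [
            record
            for records in self._cache.values()
            for record in records
            if needle in record["name"].lower()
        ]


calls = {"count": 0}


def fetch_example_school() -> List[Record]:
    # Hypothetical provider; counts how often it is contacted.
    calls["count"] += 1
    return [
        {"name": "Example Course in Statistics"},
        {"name": "Example Course in History"},
    ]


hub = InformationHub()
hub.register("example-school", fetch_example_school)
hub.collect()
first = hub.search("statistics")
second = hub.search("history")
print(calls["count"], len(first), len(second))
```

In this sketch the provider is contacted only during `collect`, so any number of searches by end-user services places no additional load on the provider.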

2.1.3 XCRI in the United Kingdom 

The project eXchanging Course-Related Information (XCRI) began in April
2005 [34, p. 1]. It is funded by the Joint Information Systems Committee
(JISC) 13 and operates as a partnership between Manchester Metropolitan
University, the JISC-CETIS service at the University of Bolton, and KaiNao
Ltd. XCRI was developed specifically for the educational market of the United
Kingdom.

The core development of XCRI is the XML specification XCRI Course
Advertising Profile (XCRI-CAP), which describes course-related information
that encompasses course quality assurance, course marketing, course enrolment
and requirements 14. The XCRI-CAP format allows learning providers
to publish their course information, so that it can be easily collected by
organizations with course search services such as UCAS 15 (UK’s national
course information and applications aggregator).

13 http://www.jisc.ac.uk

14 http://www.xcri.org

15 http://www.ucas.ac.uk

Before developing the XCRI Course Advertising Profile, the project group
reviewed existing course information standards to assess their suitability for
the needs of the UK [34, pp. 4 and 6]. The most promising specification
was the XML schema of the Norwegian CDM. The decision to build on
the CDM work was also based on its particular attention to the European
Credit Transfer Scheme (ECTS). However, some structural problems which
compromised CDM’s ability to fulfill the needs of UK’s educational systems
were identified:

• In CDM, the course specification and its offering are conflated. However,
several UK institutions need a separation of these two elements,
as a specification can be offered many times or in multiple locations.

• CDM’s hierarchy of programs and courses is too strict for the UK’s
educational institutions. More flexibility would be needed, for example
a possibility to add more elements between CDM’s program and course
objects.

Besides CDM, other standards, like the Swedish EMIL specification, were
also analyzed. EMIL incorporates the missing features of CDM, but it is not
well suited to realizing XCRI’s objectives of addressing quality, marketing,
reporting and enrolment requirements [34, p. 4]. Therefore, XCRI
decided to develop its own specification. Nevertheless, XCRI still cooperates
with the CDM and EMIL teams to advance understanding of curriculum
metadata requirements and possibilities [30, p. 3].

XCRI-CAP provides the following core elements, as shown in its XML schema 16:

• Catalog: This element contains providers of courses or educational
training programs.

• Provider: This element covers all relevant information about the
institution that provides courses or other educational offers. A provider
can also include one or several courses.

• Course: This element handles information about the course, such as
title, description, qualification, and is comprised of presentations.

• Presentation: This element includes description, start, end, duration,
study mode, costs, language, entry requirements, available places, and
venues.

• Venue: This element covers description, address, phone, email, URL of
the venue of a presentation.
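The nesting of these elements can be made concrete with a short sketch that builds such a document using Python's standard `xml.etree.ElementTree` module. It is an illustration under stated assumptions: the containment (catalog, provider, course, presentation, venue) follows the list above, but the lowercase tag names, the child elements chosen, the omission of XML namespaces, and all values are assumptions and may differ from the actual XCRI-CAP schema.

```python
import xml.etree.ElementTree as ET

# Build the hierarchy from the outside in. Tag names are assumed, and
# namespaces are omitted for brevity.
catalog = ET.Element("catalog")

provider = ET.SubElement(catalog, "provider")
ET.SubElement(provider, "title").text = "Example University"

course = ET.SubElement(provider, "course")
ET.SubElement(course, "title").text = "Introduction to Metadata"

# One course can have several presentations; only one is shown here.
presentation = ET.SubElement(course, "presentation")
ET.SubElement(presentation, "start").text = "2010-03-01"

venue = ET.SubElement(presentation, "venue")
ET.SubElement(venue, "description").text = "Main campus"

xml_text = ET.tostring(catalog, encoding="unicode")
print(xml_text)
```

Separating `course` from `presentation` reflects the requirement identified above: one course description can be offered several times or in several locations without being repeated.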

16 http://www.xcri.org/Tools.html

In addition to the national system UCAS, many universities started projects
using XCRI to publish and compare their educational offers. Examples are
collected on the website of JISC 17.


2.1.4 PAS 1068 in Germany 

The Publicly Available Specification (PAS) 1068, which was published in
January 2007, is a guideline for the description of educational offers. It is
not subject to national, European or international standardization.

PAS 1068 is described in [14, pp. 3-4] in the following way:

With this PAS, the "Transparency in e-Learning" working group
at the German National Standards Body DIN 18 provides a
description scheme that allows the providers to describe their
educational offers in terms of a "leaflet." The minimum set of data
is specified to standardize the description of educational offers
and to enable their comparison. [...] The PAS is applicable to all
processes in learning, education, and training, and particularly
includes the consideration of e-Learning.

The description scheme defines criteria which are mandatory, optional or
optionally mandatory (they have to be provided if they exist) and how they
should be described (yes/no-answer or detailed information). PAS 1068 also
uses references to the IEEE specification Learning Object Metadata (LOM)
and to the German PAS 1045 (Further education and professional training
databases and information systems: criteria of the contents and for data
exchange formats). As PAS 1068 especially considers e-Learning, it offers
many criteria concerning technical aspects, data recording, data processing,
accessibility and functional aspects that are less available in the
aforementioned specifications. With the information structured by the
guideline of PAS 1068, providers can not only describe their educational
offers, but also promote them and make them comparable. The description
scheme of PAS 1068 is usually constructed in the form of a table, but it is
also provided as an XML binding 19.
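The three requirement levels of the description scheme lend themselves to a simple completeness check. The following Python sketch shows one possible reading: a criterion counts as required if it is mandatory, or if it is optionally mandatory and known to exist for the offer in question. The criterion names (`title`, `provider`, `price`, `certificate`), the function name, and the way applicability is passed in are hypothetical and are not taken from PAS 1068.

```python
from enum import Enum
from typing import Dict, List, Set


class Level(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    CONDITIONAL = "optionally mandatory"  # required only if it exists


def missing_criteria(
    scheme: Dict[str, Level],
    answers: Dict[str, str],
    applicable: Set[str],
) -> List[str]:
    """Return the required criteria that the description does not answer."""
    missing = []
    for name, level in scheme.items():
        required = level is Level.MANDATORY or (
            level is Level.CONDITIONAL and name in applicable
        )
        if required and name not in answers:
            missing.append(name)
    return missing


# Hypothetical scheme and description.
scheme = {
    "title": Level.MANDATORY,
    "provider": Level.MANDATORY,
    "price": Level.OPTIONAL,
    "certificate": Level.CONDITIONAL,
}
answers = {"title": "Example e-Learning Course"}

# The offer is assumed to issue a certificate, so that criterion applies.
missing = missing_criteria(scheme, answers, applicable={"certificate"})
print(missing)
```

Whether an optionally mandatory criterion exists for a given offer cannot be derived from the description itself, which is why the sketch takes that knowledge as a separate input.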

17 http://www.jisc.org.uk/whatwedo/programmes/elearningcapital/courseinfo.aspx

18 http://www.din.de

19 http://www.qed-info.de/PAS (in German)

2.1.5 PESC in the USA

The harmonization and easier exchange of course information needs to be
achieved not only within Europe. In 1997, the Postsecondary Electronic
Standards Council (PESC) 20 was founded in Washington, D.C. to lead the
establishment and adoption of data exchange standards in education. Its goals
are to enable the improvement of institutional performance and foster
collaboration across educational communities in order to lower costs, improve
service, and attain system interoperability.

PESC focuses on the development of electronic standards for student-related
information, but it also considers information about courses in its
specifications. The first standards in higher education were already
developed before PESC in the early 1990s, and they were based on the EDI
(electronic data interchange) transcript standards. These standards include
the Educational Course Inventory (Transaction Set 188), which is used by
postsecondary educational institutions to transmit course information.
Nowadays, admissions, financial aid, and registrar communities are developing
XML standards under PESC. In 2005, the development of a Course Catalog in XML
was initiated to include institutional and curriculum information in the XML
Postsecondary Transcript of PESC [37]. PESC also observes the development of
XCRI in the United Kingdom, and incorporates parts of it in its Course
Catalog. However, the XML Course Inventory workgroup has been inactive
since January 2009.


2.2 History and Development of MLO-AD 

As described in Section 2.1, the existing specifications for providing
information about learning opportunities are all national standards, and are
not used across different countries.

In 2004, CEN/ISSS WS-LT 21 (Workshop on Learning Technologies at the
European Committee for Standardization) proposed a project for the
harmonization of the existing European specifications, which are CDM, CDM-fr,
EMIL, XCRI and PAS 1068. The aim was to create a Europe-wide standard
for describing learning opportunities.

20 http://www.pesc.org

21 http://www.cen.eu/isss/Workshop/lt

After CDM did not receive funding from the Directorate General (DG) Industry
of CEN [38, p. 4], a group of experts in this field continued the development
of this harmonization on a voluntary, unpaid basis. In November 2007,
representatives from European and US universities and other institutions
met in Rome to discuss future common data standards for exchanging student
curriculum data. Requirements for harmonization of course description
were summarized by a technical committee of CEN in the following so-called
"Athens Declaration" [31]:

• There is a considerable interest in many countries in Europe in creating
specifications for the exchange of information about courses and other
learning and training opportunities.

• There is a clear scope for greater harmonization of these efforts within
a European context.

• All existing national initiatives will benefit from contributing towards
harmonization at a European level.

• There are sufficient clear commonalities across existing national
initiatives for future European standards to be developed.

• Harmonization should balance the benefits of common standardization
with the necessity of meeting local contextual needs and infrastructure.

• Harmonization efforts should focus on small, simple models based upon
existing commonalities that can be expanded upon at national or
regional level, rather than all-inclusive, monolithic standards.

The Athens Declaration is also applicable to student curriculum data which
cover all personal information, courses attended, and grades attained by the
student.

In December 2007, the group of experts decided to build the new standard
on the core elements of CDM and changed the name of the specification
to Metadata for Learning Opportunities (MLO) [38, p. 7]. The focus of the
project is the advertising of learning opportunities, and therefore, the
development of the MLO-Advertising (MLO-AD) standard, which is described
in [39, p. 2] as follows:

MLO-AD is a standard addressing metadata sufficient for
advertising a learning opportunity. The goal of MLO-AD is to provide
information about a learning opportunity, to enable the learner
to make a decision if there is a need for more information about
the learning opportunity, and where to find that information. The
group also aims at developing a lightweight standard which is
designed to facilitate semantic technologies and web architectures
to support several mechanisms for exchange of the information
and aggregation of information by third party service suppliers.

CEN endorsed MLO-AD as a CEN Workshop Agreement (CWA) in October
2008 22, 23 and committed itself to the further development of the
specification into a formal European standard known as the European Norm
(EN). After the agreement upon an EN (this development may take up to two
years), MLO-AD will be a standard in all 30 member countries of the CEN.

22 http://www.cen.eu/cenorm/standards_drafts/index.asp

23 http://zope.cetis.ac.uk/members/scott/blogview?entry=20081021140752

MLO-AD should not be limited to Europe. GTN-Québec 24 is a working
group whose mission is to provide the educational community with expertise
in the area of e-Learning object standards in order to promote the creation
and enrichment of an educational heritage for Québec’s education community,
as well as that of the Francophone world. GTN-Québec aims to propose
an application profile based on the MLO-AD specification to the community.
Additionally, it provides working examples of XML binding and
RDFa/microformat, as well as implementation guidelines. Different interest
groups collaborate to reach the targets of GTN-Québec. One of them is the
Computer Research Center of Montréal (CRIM) 25, which also provides services
for education and training. The CRIM Training Center aims to fulfill the
training needs of businesses and organizations in the field of information
technology 26.

MLO-AD presents an abstract model for representing and advertising learning
opportunities. The data model is based on the object of the learning
opportunity and specifies three resources, which are the provider, the
specification and the instance of the learning opportunity. The objects of
MLO-AD are described in [39, pp. 6-7] in the following way:

• Learning Opportunity (LO): A chance to participate in education or
training.

• Learning Opportunity Provider (LOP): An agent (person or organization)
that provides learning opportunities.

• Learning Opportunity Specification (LOS): An abstract description of a
learning opportunity, consisting of information that will be consistent
across multiple instances of the learning opportunity.

• Learning Opportunity Instance (LOI): A single occurrence of a learning
opportunity. Unlike a Learning Opportunity Specification, a Learning
Opportunity Instance is not abstract, may be bound to particular dates
or locations, and may be applied for or participated in by learners.
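The relationship between the three resources can be sketched with simple record types. The Python code below is a minimal illustration, not the normative MLO-AD model: the class names follow the object names above, while the attributes, the assumption that a provider holds specifications and a specification holds its instances, the helper `instances_in`, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LearningOpportunityInstance:
    """A single occurrence, bound to a date and a location."""
    start: str
    location: str


@dataclass
class LearningOpportunitySpecification:
    """Abstract description shared by all of its instances."""
    title: str
    instances: List[LearningOpportunityInstance] = field(default_factory=list)


@dataclass
class LearningOpportunityProvider:
    """Person or organization providing learning opportunities."""
    name: str
    specifications: List[LearningOpportunitySpecification] = field(
        default_factory=list
    )


def instances_in(
    provider: LearningOpportunityProvider, location: str
) -> List[Tuple[str, LearningOpportunityInstance]]:
    """List (title, instance) pairs of a provider for one location."""
    return [
        (spec.title, instance)
        for spec in provider.specifications
        for instance in spec.instances
        if instance.location == location
    ]


# Hypothetical sample data: one specification offered twice.
spec = LearningOpportunitySpecification(
    title="Example Training Course",
    instances=[
        LearningOpportunityInstance("2010-02-01", "City A"),
        LearningOpportunityInstance("2010-05-01", "City B"),
    ],
)
provider = LearningOpportunityProvider("Example Training Center", [spec])

matches = instances_in(provider, "City B")
print(matches)
```

The title is stored once on the specification and reused for every instance, which corresponds to the idea that a specification holds the information that stays consistent across occurrences.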

24 http://www.gtn-quebec.org

25 http://www.crim.ca

26 http://www.crim.ca/en/services/Formation

In this context, the following restriction, described in [39, p. 3], must be
kept in mind:

The model proposed within the standard is not intended to define
the electronic representation of learning objects in general; the
scope of the standard is restricted to define the electronic
representations of learning opportunities to facilitate their advertising
and subsequent discovery by learners.

Figure 2.1: Illustration of the MLO domain model (according to [39, p. 6])

The model of the MLO and the associations of its objects are illustrated in 
Fig. 2.1. During the creation of the model design, attention was also paid 
to supporting the ECTS description and the exchange of ECTS information. 
The standard only describes the model and does not address the vocabularies 
needed to ensure semantic interoperability between different educational 
and jurisdictional domains. Vocabularies are left out because they need to 
be updated and maintained frequently. Therefore, all vocabularies will be 
maintained as separate CEN Workshop Agreements (CWAs) by the CEN/ISSS 
WS-LT [39, pp. 2-3]. 

Further developments of the MLO are specified in [39, p. 3] as follows: 

In the future, the MLO set of standards will be further devel- 
oped to describe Metadata for Learning Opportunities related to 
the Europass System used throughout Europe. Based on other 
needs for metadata related to Learning Opportunities, new stan- 
dardization projects [besides the project for advertising learning 
opportunities; author’s note] could also be launched. 

2.3 Existing Portals for Learning Opportunities 

Learners often want information about learning opportunities from several 
educational institutions. Many therefore use portals that collect course and 
program data from multiple institutions and present them in a comprehensive 
and structured way. Some platforms that allow learners to search for learning 
opportunities across one or more countries already exist. Examples of sites 
that help to find a course or training program are SchoolFinder.com 27, UCAS 
in the UK 28, FindaCourse.com 29, Ploteus 30 (an EU project), and 
FastTomato.com 31. This section describes three representative portals for 
learning opportunities in more detail. 


2.3.1 Moveonnet 

Unisolution 32 is a software and consulting company for institutions of higher 
education, founded by students of the Technical University Darmstadt in 
Germany. In 2006, it launched the web portal moveonnet 33, which provides 
a comprehensive directory of higher education worldwide. According to the 
information on its homepage, moveonnet is especially relevant for 
international relations officers, international students, and international 
researchers. 

The moveonnet portal is built on the Worldwide Directory of Higher 
Education, which provides information about higher education in four 
areas 34: 

• Institutions of higher education, including general information, contacts, 
partner lists, information for exchange students, ranking positions, 
map locations, and more. 

• Countries, including general information, regions/states, higher education 
system, institution types and list of institutions. 

• Networks, including general information, aims, contacts and members. 

• Intensive courses, such as summer courses or language courses, includ- 
ing description, modalities, and contacts. 

27 http://schoolfinder.com 

28 http://www.ucas.ac.uk 

29 http://findacourse.com 

30 http://ec.europa.eu/ploteus 

31 http://www.fasttomato.com 

32 http://www.unisolution.eu 

33 http://www.moveonnet.eu 

34 http://www.moveonnet.eu/about 

Figure 2.2: Screenshot with search results of moveonnet’s international 
programs 


The user interface for searching in the worldwide directory for international 
programs includes input and select fields for the keyword, the program type 
(Bachelor, Master, PhD, or summer school), and the desired country. Fig. 2.2 
shows the results of a search for Master programs in Computer Science. 

Each institution is presented to students and partners in a standardized 
and comparable way. An institution can register on moveonnet without a 
registration fee. The institution itself is responsible for keeping its data 
up to date and can enter and update the information at any time, free of 
charge. More than 1,000 institutions of higher education are already 
registered on moveonnet. 

Moveonnet also provides a search plugin, which can be installed in the 
web browser to search the worldwide directory of higher education directly, 
without accessing the website. 

2.3.2 CourseAtlas and CollegeTransfer 

The company AcademyOne 35, located in Pennsylvania, United States, was 
founded in 2005 with the aim of enabling new forms of academic collaboration 
on the Internet. "We are dedicated to improving the underlying 
infrastructure, applications and processes supporting the diverse array of 
educational institutions while reducing redundancy, complexity and cost," says 
David K. Moldoff, founder of AcademyOne, on the company’s web page 36. 
AcademyOne offers two web-based platforms for students: CourseAtlas 37 and 
CollegeTransfer 38. 

35 http://www.academyone.com 

Figure 2.3: Screenshot of CollegeTransfer, showing search results with the 
course title "Computer Science" 

CourseAtlas is a searchable course database with more than three million 
courses offered by over 4,000 colleges and universities within the United 
States. Courses can be searched free of charge by ID, title, description, sub- 
ject, school name or location. Once a year, AcademyOne updates and ag- 
gregates the information about courses of colleges and universities. However, 
institutions can submit their course catalogs to AcademyOne at any time to 
have them included in CourseAtlas. 

After two years of research, CollegeTransfer was launched in 2007 to help 
all parties involved in the transfer of students and academic credits from 
one college or university to another. This comprehensive portal allows 
students to evaluate transfer options and to save and share information about 
themselves, the courses they attended, and their credits in a secure way. 
Institutions of higher education can use CollegeTransfer to manage the 
workflow of a college transfer, to build partnerships, and to advertise and 
share their course information. It also provides information about how 
academic credits can be evaluated to be acceptable for different educational 
providers. CollegeTransfer also uses the database of CourseAtlas to 
facilitate the search for courses and schools. Fig. 2.3 shows the user 
interface of CourseAtlas in CollegeTransfer and the result of a keyword 
search. 

36 http://www.academyone.com/AboutUs/OurMission/tabid/199/Default.aspx 

37 http://www.courseatlas.com 

38 http://collegetransfer.net 

AcademyOne uses PESC standards for its platforms to collect and display 
information about courses and students. 


2.3.3 Hotcourses 

The history of the Hotcourses 39 course guide began in 1996 in paper form, 
when the Hotcourses magazine was published in London. In 2000, the online 
version Hotcourses.com followed, offering information about 50,000 courses. 
Today, the database covers more than one million courses from more than 
17,000 providers within the United Kingdom. Hotcourses can therefore be 
regarded as the UK’s largest course finder. Besides its headquarters in 
London, Hotcourses has opened offices in India 40 (responsible for 
Uniguru.com, a study abroad site for Indian students offering information 
on over 350,000 courses), Australia 41 (more than 45,000 courses) and South 
Africa 42 (over 2,000 courses). 

Hotcourses offers a simple search mask which is permanently available on 
the website. The user can enter a subject keyword, course type (part-time, 
full-time without degree, undergraduate, postgraduate, MBA, home study), 
town or postcode. After the results of a search query have been displayed, the 
search can be further refined by defining a preferred qualification, duration 
or hours of study. Fig. 2.4 shows the results of a search for postgraduate 
medicine courses. 

Providers of learning opportunities can sign up on Hotcourses.com for free. 
It is also in their hands to update the information about their courses and to 
ensure that their listings remain accurate. Every week, Hotcourses collects 
data on new courses to add them to the site. Information about undergraduate 
courses is provided by UCAS (the UK’s national course aggregator), 
which uses the XCRI standard. 

39 http://www.hotcourses.com 

40 http://www.hotcourses.co.in 

41 http://www.hotcourses.com.au 

42 http://www.hotcourses.co.za 

Figure 2.4: Screenshot of Hotcourses with the search result for postgraduate 
medicine courses in the UK 


2.4 Existing Aggregators 

A challenge in organizing a database is to keep the data up to date at all 
times and to gather recently emerged information. To solve this problem, the 
database needs an automatic aggregator system that harvests data from its 
sources at regular intervals. 

Aggregators have been used in many different applications. One example is 
an application that collects bank account data, investment account data, 
and other account data on one page. Nearly every bank that offers e-banking 
supports this feature and consolidates all the different accounts of a client. 

Sites that compare the prices of different suppliers also use aggregation 
functionality to collect this data. Examples of such web applications 
are PriceGrabber.com, PriceRunner.co.uk, Shopbot.ca, and Geizhals.at. 

E-mail clients such as Microsoft Outlook and Mozilla Thunderbird can act as 
"e-mail aggregators," because they show newly received messages from 
multiple mailboxes of a user. Consequently, the user does not need to check 
each mailbox separately. 

Another category of aggregators is "media aggregators," which can 
automatically download video and audio files published as "podcasts." 
Podcasts are special types of RSS feeds, where RSS is a standardized format 
for publishing frequently updated content [42, p. 78]. These aggregators are 
therefore also called "podcatchers." Examples of media aggregators are 
Apple iTunes, Nullsoft Winamp, and Microsoft Zune. 

Three aggregator systems that collect data regularly and automatically are 
described in more detail in this section. 

2.4.1 Google News 

In 2002 43, Google launched the news aggregator Google News 44, which is 
described on its homepage in the following way: 

Google News is a computer-generated news site that aggregates 
headlines from more than 4,500 English-language news sources 
worldwide, groups similar stories together and displays them according 
to each reader’s personalized interests. 

Fig. 2.5 is a screenshot of its homepage, showing the main articles and 
headlines, as well as their sources. Google News supports many different 
languages and provides more than 40 regional editions. In general, Google 
News aims to promote original journalism. This is achieved by harvesting from 
professional news sites. The selection and ranking of the articles in Google 
News are fully automated and, therefore, not influenced by human editors. 
The ranking is mainly based on the following factors 45: 

• Freshness of content 

• Diversity of content 

• Rich textual content which would help users searching for information 
to find the articles 

The only human editorial input into the system is the list of sources from 
which Google News harvests the articles. However, Google accepts URLs of 
other news sites if a user wishes to include them in Google News. Users can 
customize their Google News site and choose preferred sections, their 
locations and the number of displayed articles. Google also provides easier 
access to updates about favorite topics via RSS and Atom feeds. After 
subscribing to a Google News feed, users regularly receive a summary of 
new articles they are interested in. 

43 http://googleblog.blogspot.com/2006/01/and-now-news.html 

44 http://news.google.com 

45 http://www.google.com/support/news_pub 

Figure 2.5: Screenshot of Google News 


2.4.2 We Feel Fine 

The independent artwork We Feel Fine 46 was developed by Jonathan Harris 
and Sepandar Kamvar with the aim of creating "an exploration of human 
emotion on a global scale." Since 2005, their data collection engine has 
automatically harvested human feelings from a large number of weblogs every 
ten minutes. To do so, the system searches the entries of weblogs for 
occurrences of the phrases "I feel" and "I am feeling" and stores the 
sentence found in its database. After that, it identifies the "feeling" in 
that sentence (for example, happy, sad, angry) using a list of about 5,000 
pre-identified "feelings." Moreover, the system saves the time the blog entry 
was written, the geographical location, the age, and the gender of the 
author, if this information is provided or can be extracted from the blog 
system. Even the local weather conditions are determined from the data on 
time and location. If the blog entry also contains an image, this image is 
saved along with the sentence. In this way, the aggregator system of We 
Feel Fine collects 15,000 to 20,000 new feelings per day. This data can be 
queried with the We Feel Fine applet in six different statistical movements. 
One of these movements is called "Madness," which was designed to show 
the feelings of the human world from a bird’s eye view. This movement is 
illustrated in Fig. 2.6. 

46 http://wefeelfine.org 

Figure 2.6: Screenshot of the We Feel Fine applet in the so-called "Madness" 
movement 
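
The phrase matching described above can be sketched in a few lines of Python. This is only an illustration of the general idea and not the actual (closed source) We Feel Fine engine; the function name `extract_feelings`, the tiny `KNOWN_FEELINGS` list, and the simple sentence splitting are assumptions made for this example.

```python
import re

# Hypothetical stand-in for the list of about 5,000 pre-identified feelings.
KNOWN_FEELINGS = {"happy", "sad", "angry"}

# Matches the two trigger phrases named in the text.
PHRASE = re.compile(r"\bI (?:feel|am feeling)\b", re.IGNORECASE)


def extract_feelings(text):
    """Return (sentence, feeling) pairs for sentences with a trigger phrase.

    The feeling is the first known feeling word in the sentence,
    or None if no word from the list occurs.
    """
    results = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if PHRASE.search(sentence):
            words = re.findall(r"[a-z]+", sentence.lower())
            feeling = next((w for w in words if w in KNOWN_FEELINGS), None)
            results.append((sentence.strip(), feeling))
    return results
```

For the input "Today was long. I feel happy about the result!", the sketch keeps only the second sentence and tags it with the feeling "happy".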

Jonathan Harris and Sep Kamvar created the data collection engine of We 
Feel Fine with Java, Perl, MySQL and Apache. The Processing software 47 by 
Ben Fry and Casey Reas is used for the applet. The code of We Feel Fine is 
closed source. However, the data of the collected feelings are freely available 
through the public API of We Feel Fine. 


47 http://www.processing.org 

Figure 2.7: Screenshot of FriendFeed showing information shared by everyone 


2.4.3 FriendFeed 

The web application FriendFeed 48 collects information shared by users and 
their friends from the different sites the users have chosen to aggregate. 
As a result, users are always up to date on the web pages, photos, videos and 
music that their families and friends are sharing. FriendFeed also makes it 
possible to comment on shared information or to start discussions among 
friends. 

More than 40 sites are supported by FriendFeed, e.g., Amazon.com, 
Dailymotion, Facebook, Flickr, Gmail, Google Talk, Mister Wong, SlideShare, 
Twitter, YouTube, and blogs. The user usually just has to provide the 
username for a site; FriendFeed automatically finds and aggregates the 
user’s public activity on this site using web crawling technologies similar 
to those of search engines. For each user, FriendFeed creates a web feed 
customized to the content the user’s friends shared. This feed includes links 
to the sources of the shared information. 

FriendFeed also offers rooms for sharing and discussing content concerning 
a special topic. Users can create their own rooms for a specific topic or 
allow just a few friends to see their shared information. Content can also 
be public in FriendFeed, which means that it is shared with everyone. 
Fig. 2.7 shows a screenshot of FriendFeed displaying publicly shared 
information. 

48 https://friendfeed.com 

With the provided API of FriendFeed, developers can interact with the web- 
site and use the shared information. They can also develop an interface for 
a mobile device or integrate FriendFeed into a web application. 



Chapter 3 

Requirements on the MLO-AD 
Aggregator Portal 


Most existing portals that allow a comparison of learning opportunities 
list courses or programs, but do not provide much information about other 
learning events, such as forums or conferences. Apart from the time and 
location of this kind of training, further details are usually not available. 

Moreover, existing portals often consider learning opportunities within one 
country only. Comparing offers from multiple countries is more difficult 
because educational systems differ and training institutions use various 
data formats to describe their offers. 

The MLO-AD standard harmonizes the data models describing learning 
opportunities across all countries that have accepted this standard (see 
Section 2.2). MLO-AD defines which information must and can be provided 
about each opportunity to make them comparable and to provide comprehensive 
information for the user. Therefore, the need for an international 
comparison of all kinds of learning opportunities can be fulfilled by a 
system based on the MLO-AD standard. 

This system must not only collect and provide the information about learning 
opportunities, but also keep it up to date. For example, subjects and contents 
of courses or training programs can be revised, starting dates can be updated 
for every new session, or costs can be adjusted. Therefore, data on learning 
opportunities must be updated frequently and automatically. An aggregator 
system provides up-to-date information through its automatic data harvesting 
functionality. 


3.1 Target Audience 

An aggregator portal that is based on MLO-AD and provides up-to-date 
information about learning opportunities has two kinds of key users: 
prospective learners, who compare different programs and courses, and 
providers of learning opportunities, who advertise their offers. 


3.1.1 Prospective Learners 

The primary target audience of the portal based on the MLO-AD standard 
consists of learners who want to search, browse, and compare information 
about electronically represented learning opportunities. People who are 
interested in learning opportunities use the aggregator portal as a 
specialized search engine for this information. 

The MLO-AD standard aims to address all types of learners: children and 
their parents search for learning opportunities to acquire special skills or 
to find another leisure activity; pupils and students look for courses 
and study programs for their education; professionals want to expand their 
knowledge with further training; companies offer conferences or training 
sessions for their employees; and other people simply want to take courses 
on topics they are interested in. MLO-AD defines various properties to 
differentiate between the types of learning opportunities, which are 
described as follows [39, pp. 9-11]: 

• Qualification defines the qualification which can be obtained from com- 
pletion of a learning opportunity. 

• Credit includes an account of credits that can be obtained from 
completing a learning opportunity. 

• Level indicates the intended outcome of the learning opportunity in 
terms of progression. 

• Engagement describes how individuals engage in a learning oppor- 
tunity. It encompasses temporal, modal and spatial patterns of en- 
gagement and attendance (e.g., overall attendance includes full-time 
and part-time; modes of study, which can be distance, campus-based, 
workplace-based, or online; pattern of attendance hours, like evenings, 
daytime, weekend). 

• Objective is the aim or learning objective for the learning opportunity. 

• Prerequisite describes the entry requirement for accessing the learning 
opportunity. 

The aggregator portal based on the MLO-AD standard is initially tailored 
to professional learners and postgraduates who aim to expand their 
knowledge with professional courses. It can be assumed that this target 
audience has good computer or media skills and is familiar with the use of 
search engines. The required information about learning opportunities and 
adequate user interfaces for all kinds of prospective learners should be 
provided in future development. 

Enrollment in a course or any other educational program mostly depends on 
several criteria, such as subject, certification or qualification, location, 
starting date, duration, costs, language, and prerequisites. Prospective 
learners want to get all the relevant information about learning 
opportunities for a subject of interest. Information about educational 
programs that can be found on the Internet is often cluttered, which 
increases the need for organizing and accessing the information in a way 
that is useful, educational, and structured. Learners usually try to avoid 
visiting various websites to collect the information they need. To get a 
good overview of opportunities and to be able to compare them, learners 
use the aggregator portal based on the MLO-AD standard. Naturally, this 
portal provides up-to-date information about educational offers. Prospective 
learners enter the information they are interested in into a search mask and 
receive the data of all available learning opportunities arranged according 
to the data model of MLO-AD. 

By providing comprehensive data on various learning opportunities, the 
aggregator portal also supports transparency for learners. 


3.1.2 Learning Opportunity Providers 

The second group of key users consists of providers of learning opportunities, 
which include universities, colleges, training organizations, as well as 
experts who offer professional courses. These providers wish to advertise 
their educational offers widely and economically. With a portal based on 
the MLO-AD standard, providers of learning opportunities can reach their 
target audience across multiple countries. 

Again, MLO-AD is a standard for all types of educational programs and, 
therefore, addresses a wide range of learning opportunity providers. The 
first version of the aggregator portal based on MLO-AD is developed for 
professional courses or training sessions, and accordingly for providers of 
these types of learning opportunities. 

Currently, many aggregators that collect information about learning 
opportunities do not support international standard formats. This means 
providers usually have to input their data manually in web forms or upload 
them with specialized tools. Because they are not based on a standard, 
existing aggregators sometimes require specialized vocabularies or encoding 
schemes (e.g., for identifiers) [39, p. 12]. To be able to provide their data 
to different services and, therefore, to advertise their educational offers, 
learning opportunity providers need to adjust their data model to the 
requirements of the existing service or have to provide the information 
"manually." Providing data about learning opportunities can become expensive 
for providers (especially if they want to supply their data to multiple 
systems) if they cannot base their data model on a widely accepted standard. 
As a result, learning opportunity providers can be expected to support 
services based on the MLO-AD standard to advertise their educational offers. 

The aggregator portal can collect information about learning opportunities 
automatically from a provider if its data model is based on a standard like 
MLO-AD. The provider does not have to update the data about its educational 
offers on the portal itself. It just has to revise the information in its 
own database, as the portal collects and updates this data frequently 
and automatically. This procedure ensures the consistency of the information 
about learning opportunities on all systems collecting this data from the 
provider. 

Although the data is collected and updated automatically, providers usually 
still wish to have influence on their information that is available on the 
aggregator system. For example, they may want to update data or add new 
offers to the database of the aggregator immediately, without waiting 
for the next update by the system. It is also possible that new providers 
want to advertise their learning opportunities using the aggregator portal 
based on MLO-AD and need a way to register themselves. Therefore, the 
aggregator portal still needs to provide the possibility to input data about 
learning opportunities manually. 

If an educational institution provides its information to the aggregator 
portal, learners can search for and browse its courses and programs and 
become aware of the institution. As the portal can be used across multiple 
countries, providers can also reach learners who are beyond the reach of 
their usual advertising of learning opportunities. 

Learning opportunity providers want to extend the range of their 
advertisements not only by covering more countries, but also by using new 
and popular technologies. In March 2009, The Nielsen Company published a 
report [36] on the Internet consumer phenomenon of social networking, which 
facilitates the building of online communities of people who share the same 
interests. This Internet activity is in such great demand that "two-thirds of 
the world’s Internet population visit a social network or blogging site and 
the sector now accounts for almost 10% of all internet time" [36, p. 1]. Since 
December 2008, social networking has had an even higher active reach than 
e-mail [36, p. 2], which can be attributed to the fact that many of these 
services provide e-mail or instant messaging to communicate with other users. 
Social network services like Facebook 1, MySpace 2, Hi5 3, LinkedIn 4, 
Orkut 5, Xing 6, and 51.com 7 are counted among the most popular 
websites 8 9. 

Obviously, advertisers are interested in the social networking market to 
appeal to new consumers. However, according to [36, pp. 5-6], "members have 
a greater sense of ‘ownership’ around the personal content they provide and 
are less inclined to accept advertising around it. [...] Advertising shouldn’t 
be about interrupting or invading the social network experience, it should 
be part of this conversation." 

Thus, learning opportunity providers can advertise their educational offers 
via social network services effectively if the user gains added value through 
interaction. This can be provided not only through searching for a learning 
opportunity via an integrated gadget of the aggregator portal, but also by 
sharing educational offers with friends. If a user is interested in a specific 
training session, the user’s friends may also like to attend this event. 

Additionally, users can build a miniature social network around a learning 
opportunity. They can exchange expectations, start discussions about 
the subject of the training program, or give feedback to other users and the 
educational institution or instructor. The trainer or teacher of the learning 
opportunity can communicate directly with learners before or after a session, 
which is also a new way of advertising an educational event. 

Educational institutions and instructors can advertise their learning 
opportunities in a target-oriented way with a gadget that provides the 
functionality of the aggregator portal based on MLO-AD and is integrated 
into a social network service. Gadgets are miniature web applications 
embedded into another website. As the first version of the aggregator portal 
has been developed for professional courses, social networks mainly used by 
professionals and postgraduates (like LinkedIn or Xing) are well suited for 
this tool. 


1 http://www.facebook.com 

2 http://www.myspace.com 

3 http://www.hi5.com 

4 http://www.linkedin.com 

5 http://www.orkut.com 

6 http://www.xing.com 

7 http://www.51.com 

8 http://www.alexa.com/site/ds/top 

9 http://mostpopularwebsites.net 

3.2 Product 

Currently, none of the existing aggregator portals providing information 
about learning opportunities is based on the MLO-AD standard. This is 
because MLO-AD is still young and in the process of being standardized. 
Together with GTN-Québec 10, the idea of an aggregator portal based on 
MLO-AD was developed to demonstrate the potential and practicability of this 
new standard and to fulfill the need for international comparison and 
transparency of learning opportunities. 

The aggregator portal must meet certain requirements to live up to the 
expectations of the target audience. One main condition is that the 
information about learning opportunities is permanently accessible from 
anywhere. To provide this permanent access and to avoid data inconsistency 
(which can occur if the system cannot harvest the required data frequently), 
the aggregator portal is developed as a web application. 

Further requirements are described in more detail in this section. They are 
divided into system logic (back-end) and user interface (front-end), 
according to their relation to the system. 

3.2.1 System Logic 

The back-end of a web system includes those components which process the 
output from a user interaction. It covers the logic of a system which runs 
on the server and mostly includes operations on the database as well. These 
procedures are usually hidden from the user. 

The database model of the aggregator portal is based on the MLO-AD standard 
for describing learning opportunities. This ensures a wide acceptance 
of the system, an easy collection of data from providers supporting this 
standard, as well as the possibility to compare all provided learning 
opportunities. 

10 http://www.gtn-quebec.org 

One of the main logical components of the aggregator portal is the 
collection, or harvesting, of data from learning opportunity providers. 
Educational institutions must provide their data using a data model based on 
MLO-AD so that the portal can collect this information. The web system knows 
the providers from which the information about learning opportunities is 
harvested. References to the sources of the collected data are saved in a 
list with the URL of the provider as a unique identifier. The system also 
needs permission from the providers to access the data of the learning 
opportunities. Different possibilities of harvesting information from an 
educational institution are described in Chapter 4. Considering the 
complexity of MLO-AD (multiple elements with different data types), more 
questions about data harvesting arise: 

• Which elements of the standard are collected and must therefore be 
provided by the educational institution? 

• Is there a need to provide vocabularies for certain elements to make 
them comparable with those of other learning opportunities? 

• How can it be ensured that the providers comply with these vocabu- 
laries? 

• How is it possible to avoid spam and to ensure a good quality of the 
information to describe learning opportunities? 

Section 4.7 recommends a solution for harvesting information for the 
aggregator portal based on the MLO-AD standard and considers the questions 
mentioned above. 
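
The list of sources described above, with the provider URL as the unique identifier and a record of whether access has been permitted, could be kept in a structure like the following. This is a minimal sketch under stated assumptions: the names `Source`, `SourceRegistry`, `register`, and `harvestable` are invented for this illustration, and normalizing a URL by stripping whitespace and a trailing slash is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Source:
    """One provider from which learning opportunities are harvested."""
    url: str                 # unique identifier of the provider
    permitted: bool = False  # has the provider granted access to its data?


class SourceRegistry:
    """Keeps the list of harvesting sources, keyed by provider URL."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, url: str, permitted: bool = False) -> bool:
        """Add a provider; return False if the URL is already known."""
        key = url.strip().rstrip("/")
        if key in self._sources:
            return False
        self._sources[key] = Source(url=key, permitted=permitted)
        return True

    def harvestable(self) -> List[str]:
        """URLs of all providers that may be harvested."""
        return [s.url for s in self._sources.values() if s.permitted]
```

Using the URL as the key makes duplicate registrations of the same provider easy to detect, while the permission flag keeps providers that have not yet granted access out of the harvesting run.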

Besides automatically collecting the data, the aggregator portal provides 
an easy and quick way of manually entering information about learning 
opportunities. This solution is described in Section 4.1. 

To keep the data of the aggregator portal consistent with the information
provided by the educational institution, the portal harvests this
information at regular intervals. Calculating the frequency of these
updates is another difficulty that has to be solved by the system.
Providers of learning opportunities usually adjust their data according to
the types of their offers. There is no need to collect and update the data
every week from an educational institution which changes its offers twice a
year. However, the information about learning opportunities can change at
any time, and new training programs can be offered unexpectedly. The system
therefore needs to know whether the information about a learning
opportunity has changed in order to avoid unnecessary updates. A solution
for collecting the information at adequate time intervals and for avoiding
unnecessary data transfer is described in Chapter 4.
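
Independent of the solution chosen later in this thesis, the two problems
named above (detecting changes and choosing an update interval) can be
illustrated with a minimal sketch. It assumes that a harvested learning
opportunity is available as a dictionary; the function names, the hashing
approach, and the halve/double rule are illustrative assumptions, not part
of MLO-AD or of the prototype.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a stable hash of a harvested record (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def next_interval_days(current_days: int, changed: bool,
                       min_days: int = 1, max_days: int = 180) -> int:
    """Halve the waiting time after a change, double it otherwise (assumed rule)."""
    proposed = current_days // 2 if changed else current_days * 2
    return max(min_days, min(max_days, proposed))


# Two copies of the same record with different key order yield the same hash,
# so the portal could skip the update.
stored = {"title": "Overview ISO 20000", "start": "2009-04-01"}
harvested = {"start": "2009-04-01", "title": "Overview ISO 20000"}
unchanged = fingerprint(stored) == fingerprint(harvested)
```

With such a rule, a provider that changes its offers twice a year would
gradually be visited less often, while a provider with frequent changes
would be visited more often.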

While the aggregator portal updates existing data about learning
opportunities, it also considers new educational offers. Therefore, the
system must be able to identify new learning opportunities. It also detects
and deletes old and expired information about courses or training programs.
As mentioned before, the content provided by the aggregator portal based on
the MLO-AD standard must be comprehensive, up-to-date, and free of spam and
expired information.

Another logical part of the aggregator portal is the support of targeted
searches. The system searches its database for learning opportunities
according to the user’s data input. Thereby, it allows the search by various
keywords and the sorting of results by different criteria. This search
procedure must operate efficiently to be able to quickly provide the user
with the required information.


3.2.2 User Interface 

The front-end of a web system manages the user interaction and forwards
its data to the components of the back-end. It shows the user interface and
displays results of an interaction given by the logic of the system.

The user interface of the aggregator portal for learning opportunities must
allow prospective learners to search and browse for educational offers. The
website provides a simple and quick way to start the search with two input
fields: one for keywords about the subject of the learning opportunity and
one for the desired location. The optional input field for the location is
a drop-down menu which allows multiple selections. Additionally, users can
extend their input mask to provide further information they are interested
in and to limit the search results. Examples of data to refine the search
are: name of the educational institution, starting date, duration, costs,
language of instruction, engagement, prerequisites, and qualification.
Advanced users can directly input various information about a learning
opportunity into the main subject field by using defined keywords with the
desired values as a query. Hence, they can refine their search without
extending the mask of the input fields for further information. The main
user interface of the search mask is simple, but functional for the refined
search. It is similar to Google, which is extendable by Google Advanced
Search 11. As the target audience of the prototype only includes
professionals and postgraduates, the type of program is not considered in
the search user interface for the first version of the aggregator portal.
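
The keyword-based query for advanced users could be handled roughly as in
the following sketch. The `key:value` syntax and the keyword names are
assumptions made for illustration; the text above does not define the
actual query syntax.

```python
import re

# Hypothetical keyword names; the actual set of defined keywords is not specified.
KNOWN_KEYS = {"institution", "start", "duration", "cost", "language", "location"}

# Matches key:value or key:"quoted value".
TOKEN = re.compile(r'(\w+):("[^"]*"|\S+)')


def parse_query(query: str) -> tuple[str, dict[str, str]]:
    """Split a search string into free-text subject keywords and key:value filters."""
    filters: dict[str, str] = {}

    def collect(match: re.Match) -> str:
        key, value = match.group(1).lower(), match.group(2).strip('"')
        if key in KNOWN_KEYS:
            filters[key] = value
            return ""              # remove the recognized filter from the text
        return match.group(0)      # leave unknown "key:value" text untouched

    subject = " ".join(TOKEN.sub(collect, query).split())
    return subject, filters


subject, filters = parse_query('C# programming location:Montreal language:"en"')
```

Here `subject` keeps the free-text part of the query, while `filters` holds
the refinements that would otherwise be entered in the extended input mask.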

After the search process has been completed, the user interface shows the
comprehensive information about the detected learning opportunities and
their providers in an appropriate way. Users have the possibility to change
the view according to the information they are especially interested in.
Usually, the displayed results are grouped by the institutions, which are
the providers of the learning opportunities. However, if prospective
learners want to attend an educational program only within a specific
location, they can switch the view to see the results displayed with flags
on a map, as in Google Maps. Another possibility for an alternative view is
to show the results within a calendar according to the starting date and
duration of the learning opportunity. The calendar view additionally
provides users with a good overview of whether an educational institution
offers multiple sessions or dates for a training program.

11 http://www.google.com/advanced_search

Moreover, users are able to sort the results according to the information of
their special interests. This can be a sort by program type, costs, starting
date, duration, or any other criterion supported by MLO-AD. Users can
adjust the presentation of the search results to their needs. These
possibilities facilitate the transparency and comparison of learning
opportunities for each learner.

The website of the aggregator portal based on MLO-AD also supports the
browsing of learning opportunities. The browsing user interface divides the
educational offers according to their subjects and categories. Moreover, the
system allows a pre-selection of the location or country. Within a category,
the user can also fetch a specific number of upcoming learning opportunities
with the selected subject.

The aggregator portal based on MLO-AD is additionally available as a gadget
for social network services. This ensures a higher and more targeted reach
of prospective learners, because of the integration into popular social
network services that are used by the target audience. The gadget is a
miniature of the aggregator portal, and the user interface of the search
mask is the same. If a user searches for an educational event via the
gadget, he/she will receive only a limited number of results. More results,
or all of them, are accessible via a link to the "normal" user interface of
the aggregator portal. The gadget does not support the browsing of learning
opportunities.

However, it provides other services matched to users’ behaviors in social
networks. The aggregator portal integrated in a social network allows users
to share learning opportunities with their friends or contacts. They can
invite or encourage each other to participate in a specific educational
event. A click on the button "Share with Contacts" near the learning
opportunity data opens the list of all the user’s contacts. Accordingly,
users can choose which contacts they want to share the specific training
with. Users can also add a session of a learning opportunity to their
"activities". Consequently, all their contacts can see that they are
attending these events. Especially in social networks such as LinkedIn and
Xing, which are mainly used by professionals, users specify their abilities,
education, and training programs they participated in. Information about
learning opportunities from the aggregator portal based on MLO-AD can be
used as a reference.

Users can also create or join groups connected with a learning opportunity.
This feature gives users the possibility to discuss an event, post comments,
or ask the instructor questions (provided the instructor is also a user of
the social network).

Besides the solutions to difficulties relating to the user interface,
Chapter 5 of this work describes the integration of the gadget into social
network services.


3.3 Success Criteria for the Prototype 

The first step of developing the aggregator portal based on the MLO-AD
standard is the creation of a prototype. Chapter 6 of this paper includes the
analysis of the prototype development process and Chapter 7 evaluates the
results according to the requirements.

The prototype must comply with the following criteria: 

• The database and its data model for describing learning opportunities 
are based on the MLO-AD standard. 

• Data about educational offers are harvested from at least two learning 
opportunity providers. 

• One of these providers is the training center of CRIM 12.

• The portal aggregates the information automatically and at adequate
time intervals. In this process, it not only updates its existing data,
but also covers information about new, recently emerging educational
offers.

• The portal allows targeted searches for learning opportunities.

• The prototype is implemented as a gadget for social network services.

• The gadget provides a user interface which allows the user to enter the
information about the learning opportunity he/she wants to search for.
The search mask consists of the main input field for keywords about
the subject.

• The results of a search for learning opportunities are shown in a simple
way. The possibilities to change the view of the given results or to sort
the detected data by special criteria are not priorities for the prototype.

The aggregator portal based on the MLO-AD standard is developed as an
open source project in order to encourage the further development of the
prototype and to increase the popularity of the system. The project is
licensed under the Educational Community License, Version 1.0, of the Open
Source Initiative (OSI), who define themselves on their website 13 as "the
stewards of the Open Source Definition (OSD) and the community-recognized
body for reviewing and approving licenses as OSD-conformant." The terms and
conditions of the Educational Community License 14 include:

Permission to use, copy, modify, merge, publish, distribute, and
sublicense this Original Work and its documentation, with or
without modification, for any purpose, and without fee or royalty
to the copyright holder(s) is hereby granted, provided that [...] the
following [see full terms and conditions of License for following;
author’s note] is included on ALL copies of the Original Work or
portions thereof, including modifications or derivatives, that [are
made].

12 http://www.crim.ca/en/services/Formation

13 http://www.opensource.org/about

14 http://www.opensource.org/licenses/ecl1.php



Chapter 4 

Data Collection from Learning 
Opportunity Providers 


The core functionality of the aggregator portal based on MLO-AD is the
collection of the data that describes learning opportunities. Data
collection is carried out automatically and at regular intervals to keep the
repository of the aggregator portal up-to-date. Consequently, the learning
opportunity provider does not have to manage the information it provides on
the aggregator portal.

This chapter explains different technologies for collecting information from
data repositories. The core functionalities of these technologies are
described to gain a better insight into the circumstances under which they
are applicable to data specified by MLO-AD. Each section also includes an
analysis of the usability of the specific technology for MLO-AD.

In the future, the aggregator portal will support different ways of
collecting data about learning opportunities to address more educational
institutions and facilitate the provision of their information.


4.1 Manual Provision of Data 

Usually, the aggregation of information about learning opportunities and
their providers is carried out automatically and regularly via a data
collection technology. However, special cases can occur in which learning
opportunity providers wish or need to deliver their information manually.

One of these cases could be an unexpected update or creation of a learning
opportunity, where the educational institution wants to submit these changes
immediately to the aggregator system without waiting for the next automatic
update by the system. Another possibility could be that a new learning
opportunity provider wants to advertise its offers using the aggregator
portal based on MLO-AD and needs a way to make itself known to the portal.
And, of course, errors which cannot be detected or solved by the system
(e.g., missing or conflicting data, wrong mapping of information about a
learning opportunity object to an MLO-AD property) can occur during the data
harvesting process. In this case, the learning opportunity provider or the
administrator of the aggregator portal has to update the information
manually.

The most common way of manually providing data on the Web is via forms with
input fields. The aggregator portal provides a form on a website with an
input field for each property of a learning opportunity, as well as of the
educational institution. The form reflects the framework of the MLO-AD data
model and ensures a proper mapping of information to the properties.
Additionally, it allows default values, specific vocabularies, or particular
data types to be defined for certain properties. However, web forms are
often misused to clog data repositories with spam or to submit wrong
information. To retain the quality of the information provided by the
aggregator portal based on MLO-AD, the data provider must be identified.
This means an educational institution has to register and log in before
submitting data about learning opportunities via the form.

Another problem relates to the usability of the web form. If an educational
institution has to provide its information manually, it usually wishes to do
this in an easy and quick way. By using the web form, the learning
opportunity provider has to log in, input all information into the specific
field of each property, and submit the form for each update. This is
probably an adequate effort for modifying the properties of one learning
opportunity. However, if the institution wants to update several educational
events, it will wish to use another possibility for the manual provision of
data, especially if its offers consist of similar data.

This alternative possibility of manual data provision is a technique via
e-mail similar to the Mail-to-Blogger system of Blogger 1. Blogger users can
post a blog entry by sending an e-mail to a certain e-mail address. Of
course, updating or creating a learning opportunity specified by MLO-AD is
more complex than posting a blog entry. Therefore, the learning opportunity
provider needs to use a template, which consists of the MLO-AD properties,
in the e-mail. This template will be available for download on the web page
of the aggregator portal. The educational institution adds the specific
values of the learning opportunity to the properties of the template.
Obviously, this technique is more error-prone than the web form. The
aggregator portal receiving and processing the e-mail must be able to handle
errors, such as a missing template or mistakes made in the template.

1 http://help.blogger.com/bin/answer.py?answer=41452

However, the e-mail technique can be more convenient for the data provider,
especially when several updates must be submitted. Data in the MLO-AD
template can be easily copied and modified for another update. To ensure the
quality of the information provided by the aggregator portal, as well as to
identify the provider of this information, the educational institution has
to register once on the system. Each institution gets a unique e-mail
address, which can be used to submit data by e-mail. An e-mail address for
each learning opportunity provider allows an easy identification of the
e-mail sender, which, in turn, also serves as a security check by the
system.
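
How a received e-mail could be processed is sketched below. The template is
assumed to consist of simple "property: value" lines in the message body,
and the property names, addresses, and the lookup table are hypothetical;
the actual template of the portal is not specified here.

```python
from email import message_from_string

# Hypothetical mapping from the unique submission address to the registered provider.
ADDRESS_TO_PROVIDER = {"crim@portal.example": "CRIM Training Center"}

# Hypothetical minimal set of properties the template must contain.
REQUIRED = {"uri", "title"}


def parse_submission(raw_mail: str) -> dict:
    """Identify the provider by the recipient address and read the template lines."""
    message = message_from_string(raw_mail)
    provider = ADDRESS_TO_PROVIDER.get(message["To"])
    if provider is None:
        raise ValueError("unknown submission address")

    record = {"provider": provider}
    for line in message.get_payload().splitlines():
        if ":" in line:
            key, value = line.split(":", 1)   # split only at the first colon
            record[key.strip().lower()] = value.strip()

    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"template is missing: {sorted(missing)}")
    return record


RAW = (
    "To: crim@portal.example\n"
    "Subject: update\n"
    "\n"
    "uri: http://provider.example/courses/ITI611en\n"
    "title: Overview ISO 20000\n"
)
record = parse_submission(RAW)
```

Rejecting a message with a `ValueError` corresponds to the error handling
mentioned above (no template, or mistakes in the template).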

Both techniques of manual data provision have to face the difficulty of
detecting whether the input data is an update of an existing learning
opportunity object or the creation of a new one. Therefore, each learning
opportunity needs a unique identifier which meets the standardized form of a
Uniform Resource Identifier (URI). This URI includes the path of the data
source, which also identifies the learning opportunity provider. Hence, this
information can additionally be used to check the quality of the supplied
data. The aggregator portal has to check the URI of a learning opportunity
included in the manual data provision to see if it is already available in
the system’s data repository. If so, it will overwrite the available data
with the submitted information; otherwise, it will create a new learning
opportunity resource.
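
The update-or-create decision described above can be sketched as follows.
The repository is modeled as a plain dictionary keyed by URI, and the host
comparison is one assumed way of using the URI as a quality check; both are
simplifications for illustration.

```python
from urllib.parse import urlparse


def upsert(repository: dict, provider_host: str, record: dict) -> str:
    """Create or overwrite a learning opportunity keyed by its URI.

    Returns "created" or "updated"; raises ValueError if the URI does not
    belong to the registered provider (a simple quality check).
    """
    uri = record["uri"]
    if urlparse(uri).hostname != provider_host:
        raise ValueError(f"URI {uri!r} is not under provider {provider_host!r}")
    action = "updated" if uri in repository else "created"
    repository[uri] = record       # overwrite existing data or add a new resource
    return action


repo: dict = {}
first = upsert(repo, "provider.example",
               {"uri": "http://provider.example/lo/1", "title": "Course A"})
second = upsert(repo, "provider.example",
                {"uri": "http://provider.example/lo/1", "title": "Course A (new date)"})
```

The first call creates the resource, and the second call with the same URI
overwrites it.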


4.2 Web Feeds 

A web feed, news feed, or simply feed is a document often based on XML
(Extensible Markup Language) and used to transfer frequently updated content
to users [42, p. 78]. This content can vary from weblog entries, or
information about video and audio files (these feeds are called "podcasts"),
to news items or any kind of content that can be packaged into a unit. Feeds
have gained in popularity with the increased use of weblogs.

Many aggregators use feeds to collect data from multiple sources (see
Section 2.4). Before an aggregator can harvest information via web feeds, a
content provider has to produce a feed and publish the feed link on its
website. The feed is updated every time the content changes. The aggregator
(also called "feed reader") subscribes to the feed by saving this link in its
feed list next to links of other feed providers. At scheduled intervals or
when instructed, the aggregator iterates over this list and asks for new
content. If new content is available, the feed reader downloads this
information by using the links to the sources saved in the web feed.
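
The polling cycle of a feed reader described above can be outlined in a few
lines. In this sketch the download and parsing of a feed is passed in as a
function, so that no network access is needed; `fake_fetch` and the feed URL
are placeholders, not real services.

```python
from typing import Callable


def poll_feeds(feed_list: list[str],
               fetch: Callable[[str], list[dict]],
               seen_ids: set[str]) -> list[dict]:
    """Iterate the subscribed feed links and return entries not seen before."""
    new_entries = []
    for feed_url in feed_list:
        for entry in fetch(feed_url):       # pull: the aggregator asks the provider
            if entry["id"] not in seen_ids:
                seen_ids.add(entry["id"])
                new_entries.append(entry)
    return new_entries


def fake_fetch(url: str) -> list[dict]:
    # Stand-in for downloading and parsing the feed at `url`.
    return [{"id": f"{url}#1", "title": "Overview ISO 20000"}]


seen: set[str] = set()
first_run = poll_feeds(["http://provider.example/feed"], fake_fetch, seen)
second_run = poll_feeds(["http://provider.example/feed"], fake_fetch, seen)
```

The first run reports the entry as new; the second run returns nothing,
because the identifier of the entry is already known.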

Web feeds are processed through pull technology, which means that the
user (or his/her aggregator) is responsible for receiving new content from
a provider. The most popular XML news feed formats are RSS and Atom.


4.2.1 RSS

RSS was first released in 1999 and developed during the following three
years into two main branches: RSS 1.0 is based on the Resource Description
Framework (RDF), which is part of the Semantic Web, while RSS 2.0 is the
current heir to the line of XML formats [42, p. 78]. This section deals with
RSS 2.0 2, where RSS stands for Really Simple Syndication.

The basic element of the RSS XML document is the rss element with the
attribute version="2.0", which contains a single channel element. The latter
represents the source of the feed and contains the entire feed content as
well as all associated metadata [17, Sec. 4.2]. A channel is described by
three required elements, which are title, link, and description, and can
include any number of item elements [42, p. 79]. Optional elements of
channel are, for example, language, copyright, category, pubDate (the
publication date of the content), lastBuildDate (the date and time when any
item of the RSS feed was last changed), and image (which has sub-elements
including url, title, and link). An item element contains the primary
content of the feed [17, Sec. 4.2] and must include at least a title or a
description. Other sub-elements of an item are, amongst others, link,
author, category, pubDate, and source (from which the item was derived).

The following XML file is a simple RSS feed describing two learning
opportunities (it does not use the specifications of MLO-AD). Entity-encoded
HTML is used in the description elements.


<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>CRIM Training Center</title>
    <link>http://www.crim.ca/en/services/Formation</link>
    <description>The CRIM Training Center provides a highly effective
      learning environment that allows learners to benefit from superior
      quality training.</description>
    <language>en-us</language>
    <pubDate>Mon, 2 Mar 2009 10:12:12 EST</pubDate>
    <lastBuildDate>Mon, 2 Mar 2009 10:12:12 EST</lastBuildDate>
    <item>
      <title>Object-Oriented programming with C#</title>
      <link>http://www.crim.ca/en/services/Formation/Cours-inscription/index.html?uri=/en/services/Formation/Cours-inscription/recherche.html&amp;id=NET513en</link>
      <description>&lt;p&gt;&lt;i&gt;C#&lt;/i&gt; is the most popular
        language in &lt;i&gt;Microsoft .NET&lt;/i&gt;. It is a pure object
        oriented (OO) language: to develop with C#, we must think in terms
        of objects. Global variables and global functions don't exist:
        everything is a class. Therefore, to develop in C# we must use OO
        concepts and know how to apply them
        effectively.&lt;/p&gt;</description>
      <pubDate>Mon, 2 Mar 2009 09:05:15 EST</pubDate>
    </item>
    <item>
      <title>Overview ISO 20000</title>
      <link>http://www.crim.ca/en/services/Formation/Cours-inscription/index.html?uri=/en/services/Formation/Cours-inscription/recherche.html&amp;id=ITI611en</link>
      <description>&lt;p&gt;This presentation offers a good overview of
        the only international standard that offers official recognition
        for a business in IT service management.&lt;/p&gt;</description>
      <pubDate>Mon, 2 Mar 2009 10:06:01 EST</pubDate>
    </item>
  </channel>
</rss>


2 http://cyber.law.harvard.edu/rss/rss.htr


Like XML, RSS is also extensible. This means that new elements which are
not declared by the RSS specification can be added. The only condition
for extending an RSS feed is to define the added elements in a namespace.
This possibility makes RSS files more flexible, but also more difficult to
parse and read [21, pp. 67-68].
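
To illustrate what parsing such a feed involves, the following sketch reads
a shortened RSS 2.0 document with Python's standard library. The `lo`
namespace and its `duration` element are invented for this example to show
how a namespaced extension element has to be addressed; they are not part of
RSS or MLO-AD.

```python
import xml.etree.ElementTree as ET

# Shortened feed; the "lo" namespace and element are hypothetical extensions.
RSS = """<?xml version="1.0"?>
<rss version="2.0" xmlns:lo="http://example.org/lo#">
  <channel>
    <title>CRIM Training Center</title>
    <link>http://www.crim.ca/en/services/Formation</link>
    <description>Training offers</description>
    <item>
      <title>Overview ISO 20000</title>
      <pubDate>Mon, 2 Mar 2009 10:06:01 EST</pubDate>
      <lo:duration>P1D</lo:duration>
    </item>
  </channel>
</rss>"""

NS = {"lo": "http://example.org/lo#"}

root = ET.fromstring(RSS)
items = [
    {
        "title": item.findtext("title"),
        "pubDate": item.findtext("pubDate"),
        # Extension elements are only found if the parser knows their namespace.
        "duration": item.findtext("lo:duration", namespaces=NS),
    }
    for item in root.iter("item")
]
```

A reader that does not know the extension namespace still obtains the
standard elements, but has no way to interpret the additional one.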


4.2.2 Atom 

The Atom Syndication Format (simply called Atom) was released as an
Internet standard in 2005 by the Internet Engineering Task Force (IETF) 3
[21, p. 70]. The current version is specified as Atom 1.0 4.

Although the semantics of Atom 1.0 are similar to those of RSS 2.0, the two
formats have different naming schemes [42, p. 83]. The following Atom
document describes the same content as the previous RSS 2.0 file.


<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>CRIM Training Center</title>
  <subtitle>The CRIM Training Center provides a highly effective learning
    environment that allows learners to benefit from superior quality
    training.</subtitle>
  <link rel="alternate" type="text/html" href="http://www.crim.ca/en/services/Formation"/>
  <link rel="self" href="http://www.crim.ca/en/services/Formation/Atom1.0_Example.xml"/>
  <updated>2009-03-02T10:12:12-05:00</updated>
  <author>
    <name>CRIM</name>
    <email>info@crim.ca</email>
  </author>
  <id>tag:www.crim.ca,2009:/en/services/Formation</id>
  <entry>
    <title>Object-Oriented programming with C#</title>
    <link href="http://www.crim.ca/en/services/Formation/Cours-inscription/index.html?uri=/en/services/Formation/Cours-inscription/recherche.html&amp;id=NET513en"/>
    <id>tag:www.crim.ca,2009:/en/services/Formation/Cours-inscription/recherche.html&amp;id=NET513en</id>
    <updated>2009-03-02T09:05:15-05:00</updated>
    <summary type="html">&lt;p&gt;&lt;i&gt;C#&lt;/i&gt; is the most popular
      language in &lt;i&gt;Microsoft .NET&lt;/i&gt;. It is a pure object
      oriented (OO) language: to develop with C#, we must think in terms of
      objects. Global variables and global functions don't exist: everything
      is a class. Therefore, to develop in C# we must use OO concepts and
      know how to apply them effectively.&lt;/p&gt;</summary>
  </entry>
  <entry>
    <title>Overview ISO 20000</title>
    <link href="http://www.crim.ca/en/services/Formation/Cours-inscription/index.html?uri=/en/services/Formation/Cours-inscription/recherche.html&amp;id=ITI611en"/>
    <id>tag:www.crim.ca,2009:/en/services/Formation/Cours-inscription/recherche.html&amp;id=ITI611en</id>
    <updated>2009-03-02T10:06:01-05:00</updated>
    <summary type="html">&lt;p&gt;This presentation offers a good overview
      of the only international standard that offers official recognition
      for a business in IT service management.&lt;/p&gt;</summary>
  </entry>
</feed>


3 http://www.ietf.org

4 http://www.ietf.org/rfc/rfc4287.txt


Atom’s basic element is feed, which uses an Atom-related XML namespace 5
and contains one or more entry elements as items. An Atom feed must include
a title, the updated element (which specifies the date of the last
modification), and the author [21, p. 71]. <link rel="alternate" ... />
indicates that this document is a feed of the declared website, while
<link rel="self" ... /> specifies the link of this feed document
[42, p. 84]. These links are also mandatory for the feed element. Entries
are described by the tags title, link, id, and summary, of which the first
three are mandatory. While RSS uses description to describe the content of
an item, Atom offers two elements: summary and content. Especially if the
content is non-textual or non-local (e.g., identified by a link), the
summary is important for accessibility reasons. An entry must not have more
than one summary element or more than one content element. Like the feed
element, entry must also include an updated element. The type attribute in
the summary element of an entry indicates that this summary includes
entity-encoded HTML. Other types are "text" (no HTML included), "xhtml,"
"application/rdf+xml," and external types specified by "application/xyz"
and their source. This specification of content types is called the "Atom
content model," and is even more important in the content element.

5 http://www.w3.org/2005/Atom

Atom is extensible in the same way as RSS 2.0, which means that elements
can be included if they are defined in an XML namespace. Compared to RSS,
Atom is specified more precisely, and it is likely to replace RSS in the
future, since the specifications of RSS will not be further developed or
clarified [21, p. 78].
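
Reading an Atom feed differs from reading RSS mainly in that every element
lives in the Atom namespace and that dates use a single, well-defined
format. The sketch below uses Python's standard library on a shortened feed;
the entry id and summary text are abbreviated placeholders, not the values
of a real feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Shortened feed with a placeholder id and summary.
FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>CRIM Training Center</title>
  <updated>2009-03-02T10:12:12-05:00</updated>
  <entry>
    <title>Overview ISO 20000</title>
    <id>tag:www.crim.ca,2009:ITI611en</id>
    <updated>2009-03-02T10:06:01-05:00</updated>
    <summary type="html">&lt;p&gt;Overview of the standard.&lt;/p&gt;</summary>
  </entry>
</feed>"""

feed = ET.fromstring(FEED)
entries = []
for entry in feed.findall("atom:entry", ATOM_NS):
    summary = entry.find("atom:summary", ATOM_NS)
    entries.append({
        "id": entry.findtext("atom:id", namespaces=ATOM_NS),
        "title": entry.findtext("atom:title", namespaces=ATOM_NS),
        # Atom timestamps follow RFC 3339, which fromisoformat() can read here.
        "updated": datetime.fromisoformat(
            entry.findtext("atom:updated", namespaces=ATOM_NS)),
        # The type attribute tells the reader how to interpret the text.
        "summary_type": summary.get("type", "text"),
        "summary": summary.text,
    })
```

The `id` and `updated` values are what an aggregator would compare against
its repository to decide whether an entry is new or has changed.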


4.2.3 Web Feeds for MLO-AD 

As RSS and Atom feeds can include any content, they can also transfer
updated information about learning opportunities. The possibility of
extending these XML documents allows further elements to be added that
specify properties of a learning opportunity or its provider.

To make web feeds applicable for MLO-AD, the required properties of MLO-AD
objects need to be matched to existing RSS or Atom tags. Because neither
feed format offers enough suitable elements to match all properties, new
elements that use an XML namespace must be defined additionally. Fig. 2.1 in
Section 2.2 shows the complexity of the MLO-AD model. Learning opportunity
providers offer learning opportunity specifications, which are abstract
descriptions and specify one or more learning opportunity instances. This
shows that the MLO-AD resources are related to each other. Moreover, each
object specifies various properties.

Web feeds are XML documents with simple specifications. Atom is the
technically superior feed format and better qualified for sophisticated
requirements like those of MLO-AD [21, p. 179]. Therefore, it is recommended
to use Atom feeds for collecting data specified by MLO-AD. However, it is
also a challenge to realize an efficient mapping of the relation between a
learning opportunity specification and an instance with Atom. Because feeds
are generated by the providers (which are various learning opportunity
providers in this case), web feeds offer no way to ensure that the feeds of
all providers use the same structure for describing learning opportunities
based on MLO-AD. Therefore, parsing and "understanding" these feeds
correctly is a challenge for the aggregator portal based on MLO-AD.
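
One conceivable mapping is sketched below: the standard Atom elements carry
the identifier, title, and modification date of a learning opportunity
instance, while extension elements in a separate namespace carry the
reference to its specification and the start date. The namespace URI and the
element names `specification` and `start` are purely hypothetical; MLO-AD
does not define an Atom binding in this form.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
MLO = "http://example.org/mlo-ad#"   # hypothetical namespace URI, not an official one

ET.register_namespace("atom", ATOM)
ET.register_namespace("mlo", MLO)


def build_entry(spec_id: str, instance: dict) -> ET.Element:
    """Build an Atom entry for one learning opportunity instance."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = instance["id"]
    ET.SubElement(entry, f"{{{ATOM}}}title").text = instance["title"]
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = instance["updated"]
    # Extension elements carry what Atom itself cannot express:
    # the link from the instance to its specification, and the start date.
    ET.SubElement(entry, f"{{{MLO}}}specification").text = spec_id
    ET.SubElement(entry, f"{{{MLO}}}start").text = instance["start"]
    return entry


entry = build_entry(
    "tag:provider.example,2009:spec/ITI611",
    {
        "id": "tag:provider.example,2009:instance/ITI611-1",
        "title": "Overview ISO 20000",
        "updated": "2009-03-02T10:06:01-05:00",
        "start": "2009-04-01",
    },
)
xml_text = ET.tostring(entry, encoding="unicode")
```

Since every provider could choose different extension elements, the
aggregator would have to agree on such a vocabulary with its providers in
advance, which is the difficulty discussed above.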

However, web feeds are commonly used to transfer updated content on the
Internet. Atom in particular can be considered as a data collection
technology for MLO-AD, even if the aggregator has to face some difficulties
when parsing these special documents.


4.3 Semantic Web

The Semantic Web is not a separate Web, but an extension of the
current one, in which information is given well-defined meaning,
better enabling computers and people to work in cooperation,


say Tim Berners-Lee (the inventor of the World Wide Web), James Hendler,
and Ora Lassila in the article "The Semantic Web," published in Scientific
American [7].

The current Web is built on HTML and XML, which describe the structure
of the information presented for humans on a website, and do not include
any information about the meaning of the displayed data [2, pp. 37-38].
Today’s search engines index HTML pages to find answers, but they also
return a lot of irrelevant information. They just look at occurrences of
words in documents, which is a hint, but does not tell what the document
really is about [6, pp. 177-178]. The meaning of the web content is not
automatically processable or "understandable" by computers or applications.
"Smart" web applications or software agents that solve complex problems can
only be as intelligent as the data that is available to them [3, p. 3].

The Semantic Web provides a knowledge representation of linked data in
order to allow machine processing. This representation of knowledge is
realized through information about information (which is also called
"metadata"), as well as through connections between different forms of data
[6, pp. 181, 185]. Web applications and agents can use all kinds of data on
the Web by following connections and by using rules to conduct automated
reasoning. This added logic of the Semantic Web must enable the description
of complex properties of objects, but must not be so intricate that machines
or agents can be tricked by being asked to consider a paradox [2, p. 69].
The Semantic Web and its knowledge representation of linked data is more
practical than the current Web because applications can get the data they
need [3, p. 4]. Tim Berners-Lee’s vision of the Semantic Web is described in
[7] as follows:


The real power of the Semantic Web will be realized when people
create many programs that collect web content from diverse
sources, process the information and exchange the results with
other programs. The effectiveness of such software agents will
increase exponentially as more machine-readable web content and
automated services (including other agents) become available.
The Semantic Web promotes this synergy: even agents that were
not expressly designed to work together can transfer data among
themselves when the data come with semantics.


Figure 4.1: Semantic Web Layer Cake (according to Tim Berners-Lee’s
diagram published on the website of W3C Semantic Web Activity)


To be able to add logic to the Web, the Semantic Web needs different
technologies and standards which are organized in layers built one upon
another. The architecture of the Semantic Web is illustrated by the Semantic
Web Layer Cake or Semantic Web Stack, which is shown in Fig. 4.1. Building
one layer upon another requires that an agent aware of a given layer is able
to interpret information at lower levels and to take at least partial
advantage of information at higher levels [2, p. 69].

The Semantic Web is based on Uniform Resource Identifiers (URI) and
Internationalized Resource Identifiers (IRI) to define web resources (all
things on the Web that can be identified) in a unique way. XML (Extensible
Markup Language) is responsible for structuring the data of a resource and
is extensible by arbitrary tags, but says nothing about the meaning of the
information or structure. XML is an open standard to exchange data between
applications over the Web, similar to HTML, which allows information to be
displayed and exchanged over the Internet. It is also the bridge to exchange
data between the two main web software development frameworks: J2EE and .NET
[2, p. 69]. XML is often combined with an XML schema, which defines the
structure of the XML document and extends it with data types. The Resource
Description Framework (RDF) is the basic framework of the Semantic Web
[3, p. 28] and expresses data models which refer to resources and their
relationships in a formal way to enable software agents to read and process
them. RDFS (or RDF-S) is the abbreviation for RDF Schema, which provides
basic elements for the description of RDF vocabularies and structures RDF
resources. SPARQL stands for SPARQL Protocol And RDF Query Language, and is
a World Wide Web Consortium (W3C) 6 recommendation that has been used since
January 2008 to query data based on RDF [29]. The Web Ontology Language
(OWL) belongs to the knowledge representation language family for authoring
ontologies on the Web. Ontologies are collections of information and define
relations among terms [2, p. 55]. The Rule Interchange Format (RIF) is still
being developed 7 by a W3C working group, and will bring rules support to
the Semantic Web. The upper layers of the Semantic Web that are underneath
the user interface (which is the connection to the user) are trust, proof,
and logic, which should prevent wrong information or relations as well as
spam pages or spam ontologies. These layers have not been fully realized
yet. An organization of documents by chains of trust will support the
identification of trustworthy information.
4.3.1 RDF 

The Resource Description Framework (RDF) is a standard developed by 
the W3C for representing information about resources in the World Wide 
Web [25, Ch. 1]. 

Each resource is identified by a URI. RDF enables statements to be 
represented in the form of subject-predicate-object sentences, which are also 
called RDF triples. A triple relates a subject to an object via a predicate, 
while all three elements are identified by URIs [35, p. 83]. Fig. 4.2 shows an 
example of a triple, where "Learning Opportunity Provider" is the subject 
described by the statement, "Offers" is the property of the subject, and 
"Learning Opportunity" is the value of the statement. This simple model of 
the triple used by RDF has many advantages. One of the most important is 
described in [2, pp. 87-88] in the following way: 

Any data model can be reduced to a common storage format 
based on a triple. This makes RDF ideal for aggregating disparate 
data models, because all the data from all models can be treated 
the same. This means that information can be combined from 
many sources and processed as if it came from a single source. 

Multiple triples can be connected to form an RDF graph, whose nodes illus- 
trate URIs of resources and whose arcs are properties [35, p. 84]. Triples of a 
graph can also originate from different sources. This supports the idea of the 
AAA slogan that says: "Anyone can say Anything about Any topic" [3, p. 35], 
meaning that anyone can create a statement about any resource. 
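
The aggregation property described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain tuples rather than with an RDF library, and the example.org URIs as well as the helper names merge_graphs and describe are hypothetical.

```python
# Minimal sketch: an RDF graph modelled as a set of
# (subject, predicate, object) tuples.
# All URIs are hypothetical placeholders under example.org.
EX = "http://www.example.org/"

source_a = {
    (EX + "providerA", EX + "mlo/offers", EX + "course1"),
}
source_b = {
    (EX + "course1", EX + "mlo/title", "Introduction to RDF"),
    (EX + "providerA", EX + "mlo/offers", EX + "course1"),  # same statement again
}


def merge_graphs(*graphs):
    """Combine triples from several sources; identical statements collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return all (predicate, object) pairs stated about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge_graphs(source_a, source_b)
```

Because every statement has the same three-part shape, the merged set can be queried as if it came from a single source, regardless of which provider published which triple.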

6 http://www.w3.org 

7 http://iit-iti.nrc-cnrc.gc.ca/new-neuf/2008/08-09-09_e.html 

(Learning Opportunity Provider) --Offers--> (Learning Opportunity) 
           Subject              Predicate          Object 


Figure 4.2: Example of an RDF triple 


RDF provides an XML-based syntax (RDF/XML) for recording and 
exchanging triples or graphs [25, Ch. 1]. The following example shows the 
triple of Fig. 4.2 in RDF/XML syntax: 


1 <?xml version="1.0"?> 
2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
3          xmlns:mlo="http://www.example.org/mlo/"> 
4   <rdf:Description rdf:about="http://www.example.org/learningOpportunityProvider"> 
5     <mlo:offers>Learning Opportunity</mlo:offers> 
6   </rdf:Description> 
7 </rdf:RDF> 


The rdf:RDF element, which starts in line 2 and ends in line 7, declares its 
content as RDF. XML namespace declarations indicate that all tags prefixed 
with rdf: are part of the namespace identified by the URI 
http://www.w3.org/1999/02/22-rdf-syntax-ns# [25, Sec. 3.1], and tags using 
the prefix mlo: are MLO-AD elements. Lines 4-6 provide the RDF/XML for the 
specific statement, which is represented as a Description that is about a 
subject (in this case, about 
http://www.example.org/learningOpportunityProvider). The contents of an 
rdf:Description element are called property elements and may contain other 
descriptions, producing nested descriptions [2, p. 91]. This example includes 
just one property, which is mlo:offers with the value Learning Opportunity 
described as a plain literal. An RDF/XML file can cover statements about 
multiple subjects, where each subject is described by an rdf:Description 
element. 
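
How such a document maps back to triples can be sketched with Python's standard xml.etree.ElementTree module. The sketch is deliberately limited: it handles only rdf:Description elements with plain-literal property elements, as in the example above, and is no substitute for a complete RDF/XML parser. The function name extract_triples is an assumption of this illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:mlo="http://www.example.org/mlo/">
  <rdf:Description rdf:about="http://www.example.org/learningOpportunityProvider">
    <mlo:offers>Learning Opportunity</mlo:offers>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, literal) for plain-literal properties only."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # joining both parts yields the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


triples = extract_triples(RDF_XML)
```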

An RDF document can refer to an RDF schema (RDFS), which provides the 
facilities to define vocabularies and to indicate specific classes and properties 
of resources, as well as the way they must be used together [9]. Consequently, 
the structure of the RDF document is specified by the RDFS. 

RDF data can be queried and accessed via SPARQL. This standard consists 
of three specifications: a description of the language to query 
RDF data across diverse data sources [29], a protocol which is described 
with WSDL 2.0 as well as by HTTP and SOAP bindings to query remote 
databases [11], and the XML format of the query result which will be 
returned [5]. 
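
As a rough illustration of the HTTP binding, the following Python sketch encodes a query for a GET request. The query parameter name is defined by the SPARQL protocol; the endpoint URL, the mlo: vocabulary, and the function name build_query_url are hypothetical. The request is only constructed, not sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and vocabulary; neither is defined by MLO-AD.
ENDPOINT = "http://www.example.org/sparql"

QUERY = """\
PREFIX mlo: <http://www.example.org/mlo/>
SELECT ?provider ?offer
WHERE { ?provider mlo:offers ?offer }
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)

# Round trip: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```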
4.3.2 OWL 

The Web Ontology Language (OWL) for knowledge representation has been 
developed and standardized by the W3C and is described in [26] as follows: 

OWL is designed for use by applications that need to process the 
content of information instead of just presenting information to 
humans. [...] OWL can be used to explicitly represent the 
meaning of terms in vocabularies and the relationships between those 
terms. This representation of terms and their interrelationships is 
called an ontology. OWL has more facilities for expressing 
meaning and semantics than XML, RDF, and RDF-S, and thus OWL 
goes beyond these languages in its ability to represent machine 
interpretable content on the Web. 

OWL provides three sub-languages for different levels of expressiveness and 
efficient reasoning: OWL Lite, OWL DL, and OWL Full. OWL Full is like an 
extension of RDF, while OWL Lite and OWL DL (Description Logics) are 
like extensions of a restricted RDF [26, Sec. 1.3]. OWL DL is the most 
prominent language of the OWL family and is most supported by the Semantic 
Web community [35, p. 88]. 

OWL enhances RDF (which also provides some specifications for ontologies) 
inter alia with more vocabulary for describing properties, classes, as well as 
relations (e.g., disjointness), cardinality, equality, and enumerated classes 
[2, p. 107]. The basis of OWL is formed by classes, relations between them, 
properties of classes, and constraints on relations and properties [2, p. 111]. 

The following code is a simple example of OWL in the RDF/XML syntax. It 
maps a part of the MLO-AD model shown in Fig. 2.1 of Section 2.2. The class 
"LoProvider" (Learning Opportunity Provider) offers a "LoSpecification" 
(Learning Opportunity Specification) class, whereby both classes are 
subclasses of class "LoObject" (Learning Opportunity Object). 


 1 <?xml version="1.0" encoding="UTF-8"?> 
 2 <rdf:RDF 
 3   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
 4   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 
 5   xmlns:xsd="http://www.w3.org/2001/XMLSchema#" 
 6   xmlns:owl="http://www.w3.org/2002/07/owl#" 
 7   xmlns="http://www.example.com/mlo-ad.owl#"> 
 8 
 9   <owl:Ontology rdf:about="http://www.example.org/mlo-ad"> 
10     <owl:versionInfo>Example of MLO-AD</owl:versionInfo> 
11   </owl:Ontology> 
12 
13   <owl:Class rdf:ID="LoObject"> 
14     <rdfs:label>Learning Opportunity Object</rdfs:label> 
15     <rdfs:comment>Abstract resource used for learning opportunities.</rdfs:comment> 
16   </owl:Class> 
17 
18   <owl:Class rdf:ID="LoProvider"> 
19     <rdfs:label>Learning Opportunity Provider</rdfs:label> 
20     <rdfs:subClassOf rdf:resource="#LoObject"/> 
21   </owl:Class> 
22 
23   <owl:Class rdf:ID="LoSpecification"> 
24     <rdfs:label>Learning Opportunity Specification</rdfs:label> 
25     <rdfs:subClassOf rdf:resource="#LoObject"/> 
26   </owl:Class> 
27 
28   <owl:ObjectProperty rdf:ID="offers"> 
29     <rdfs:domain rdf:resource="#LoProvider"/> 
30     <rdfs:range rdf:resource="#LoSpecification"/> 
31   </owl:ObjectProperty> 
32 
33   <owl:DatatypeProperty rdf:ID="location"> 
34     <rdfs:domain rdf:resource="#LoProvider"/> 
35     <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/> 
36   </owl:DatatypeProperty> 
37 
38   <LoProvider rdf:ID="CRIM"/> 
39 </rdf:RDF> 


The header of the OWL document includes rdf:RDF as the root element 
and specifies a number of namespaces [2, p. 112]. The header also contains 
the owl:Ontology block for describing the current ontology. This block can 
include import statements by using owl:imports [26, Sec. 2.1]. Much of the 
power of ontologies comes from class-based reasoning, which is supported 
in OWL by the owl:Class elements [2, p. 113]. Classes are sets containing 
members that are also called individuals. Line 38 defines the individual 
"CRIM" for the class "LoProvider" specified in line 18. rdfs:subClassOf is 
a fundamental constructor for classes and relates a specific class to a more 
general one [22, p. 102]. These relations are shown in lines 20 and 25. 
Properties are separated into datatype properties, which define relations with 
RDF literals or XML schema vocabularies (lines 33-36), and object properties, 
which relate instances of two classes (lines 28-31) [22, p. 103]. 
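
The subclass relations of such a document can be read out with a few lines of Python. The sketch below works on a trimmed copy of the class declarations and only collects the declared rdfs:subClassOf links; it performs no reasoning, and the function name class_hierarchy is an assumption of this illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Trimmed copy of the class declarations, without labels and properties.
OWL_XML = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="LoObject"/>
  <owl:Class rdf:ID="LoProvider">
    <rdfs:subClassOf rdf:resource="#LoObject"/>
  </owl:Class>
  <owl:Class rdf:ID="LoSpecification">
    <rdfs:subClassOf rdf:resource="#LoObject"/>
  </owl:Class>
</rdf:RDF>"""


def class_hierarchy(owl_xml):
    """Map each class ID to the list of its declared direct superclasses."""
    root = ET.fromstring(owl_xml)
    rdf_id = f"{{{NS['rdf']}}}ID"
    rdf_resource = f"{{{NS['rdf']}}}resource"
    hierarchy = {}
    for cls in root.findall("owl:Class", NS):
        parents = [
            sub.get(rdf_resource).lstrip("#")
            for sub in cls.findall("rdfs:subClassOf", NS)
        ]
        hierarchy[cls.get(rdf_id)] = parents
    return hierarchy


hierarchy = class_hierarchy(OWL_XML)
```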

OWL provides many more constructs to specify classes, properties, their 
relations, and characteristics, as well as to define cardinality, equality, or 
restrictions. All constructs are precisely defined on the reference website 8 . 


8 http://www.w3.org/TR/owl-ref 

4.3.3 Semantic Web for MLO-AD 

The Semantic Web provides different layers of technologies which can be used 
to describe and collect information about learning opportunities. Resources 
of MLO-AD can be described very well via RDF or OWL, since describing 
data models and their relations is the main task of these specifications. RDF 
and OWL also allow the inclusion of XML or RDF schemas that define 
the structure of a document to be able to describe a learning opportunity 
according to MLO-AD. Both specifications are based on XML, which is the 
standard to exchange data between applications. 

Compared to OWL, RDF and RDFS have some disadvantages: It is not 
possible to define properties of properties, conditions for class membership, 
or equivalence and disjointness of classes [2, p. 102]. Additionally, "RDF is 
roughly limited to binary ground predicates and RDF Schema is roughly 
limited to a sub-class hierarchy and a property hierarchy with domain and 
range definitions," according to [2, p. 107]. OWL, however, is a powerful but 
also complex language. RDF and RDFS are too simple for describing complex 
ontologies for the Semantic Web, but they are sufficient to describe the data 
model of MLO-AD. 

To be able to collect information about learning opportunities via RDF, the 
provider of the educational offer has to create and publish this document on 
the Web. The aggregator checks this document for modifications at regular 
intervals. Unlike the Atom web feed, RDF and OWL do not specify an updated 
construct to identify a change of the document. To avoid the unnecessary 
transfer of data, the document needs to specify an indicator (e.g., a 
timestamp) so that the aggregator knows whether the document and its 
content were updated. 
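
One possible indicator, named here as an assumption of this sketch rather than as part of RDF, OWL, or MLO-AD, is the standard HTTP conditional request: the aggregator sends the time of its last harvest in an If-Modified-Since header, and a server that supports this mechanism answers with status 304 and no body if nothing has changed. The document URL and function names are hypothetical, and the request is only constructed, not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def conditional_request(url, last_harvest):
    """Build a GET request that lets the server skip the body if unchanged."""
    return Request(url, headers={
        # HTTP dates are expressed in GMT, so last_harvest must be a UTC datetime.
        "If-Modified-Since": format_datetime(last_harvest, usegmt=True),
    })


def needs_processing(status_code):
    """HTTP 304 (Not Modified) means the stored copy is still current."""
    return status_code != 304


last_harvest = datetime(2009, 7, 1, 12, 0, tzinfo=timezone.utc)
request = conditional_request("http://www.example.org/mlo/offers.rdf", last_harvest)
```

This only helps if the provider's web server maintains modification dates for the published document; otherwise the full document is returned on every check.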

To be exact, RDF and OWL are both specifications to describe resources 
and are not standards for data transfer over the Web. However, RDF can 
be queried remotely by using the SPARQL protocol which is based on a 
web service technology. Another possibility to transfer RDF documents is 
via RDF feeds, which are the second development of RSS feeds specified as 
RSS 1.0 (RDF Site Summary) 9 . RDF feeds are not widely used, but examples 
do exist, as the website of Unified Data Feed 10 shows. 

A drawback of RDF is the complex specification and, consequently, the 
complicated creation of RDF documents (OWL is even more complex than RDF). 
This could be a barrier for learning opportunity providers to describe their 
MLO-AD information by RDF. 

9 http://web.resource.org/rss/1.0/spec 
10 http://web2express.org/ufeed 

4.4 Web Scraping 

Web scraping is a technique to extract information from web pages, and is 
consequently also called web data extraction. It is often equated with screen 
scraping and is closely related to web indexing. Most search engines use web 
robots or web spiders (which are a special type of web robot) for web indexing 
and categorizing the content, which allows a faster provision of search 
results [41, pp. 104-110]. A web robot (also called "bot" or "crawler") is 
described in [19, p. 1] as "an Internet-aware program that can retrieve 
information from specific locations on the Internet." Compared to web 
indexing, web scraping is more focused on the transformation of unstructured 
content into structured data. In particular, aggregators which collect 
information to offer price comparison or current weather statistics use web 
scraping to harvest their required data (see Section 2.4). 

Before extracting data from a web page, a bot has to fetch the latest instance 
of the page from its respective URL and then parse the web page [8, p. 568]. 
According to the task of the bot, it looks for specific keywords and processes 
this data, transforms web content into structured data, or constructs and 
analyzes the HTML tree of a web page. This might seem to be a trivial 
process at first glance, but web scraping can also face some difficulties, as 
described in [28, p. 635]: 

A comprehensive data extraction process must deal with such ob- 
stacles as session identifiers, HTML forms, client-side JavaScript, 
incompatible datasets and vocabularies, and missing and conflict- 
ing data. Proper data extraction also requires solid data valida- 
tion and error recovery to handle data extraction failures. 

Several types of software for web scraping, which also support various 
features (e.g., transformation into XML, creation of web feeds, DOM parsing) 
to harvest data from web pages, already exist 11 . 

The process of web scraping, especially of complex data, can be facilitated 
for bots if the web page is enhanced with semantic information. Hereby, the 
web page is still based on HTML or XHTML (the "real" Semantic Web 
is based on RDF), but semantics are added by so-called "inline metadata 
formats." The two main technologies for this inline metadata are RDFa and 
microformats. HTML 5 12 also provides more semantic information through 
new HTML elements, but it is not fully supported by all common browsers 
yet. 

11 http://en.wikipedia.org/wiki/Web-scraping_software_comparison 

12 http://www.w3.org/TR/html5 

4.4.1 RDFa 

RDFa is a W3C recommendation and allows human-readable data to be 
marked up with machine-readable indicators with a few simple XHTML 
attributes [1, Ch. 1]. Therefore, browsers and other agents can interpret and 
use this metadata. RDFa is only available for XHTML but not for HTML, since 
the latter is not extensible. The extension of XHTML results from reusing 
attributes from XHTML meta and link elements and applying them to 
other XHTML elements. Accordingly, this allows an annotation of XHTML 
markup with semantic information [15, p. 2]. RDFa uses and requires a new 
form of URIs called CURIE (compact URI). 


<div xmlns:dc="http://purl.org/dc/elements/1.1/"> 
  <div about="/alice/posts/trouble_with_bob"> 
    <h2 property="dc:title">Learning Opportunity Provider</h2> 
    <p property="dc:description">Learning Opportunity Providers provide 
      all kinds of educational events. They include educational 
      institutions but also instructors or experts.</p> 
  </div> 
</div> 


The underlying abstract representation of RDFa is RDF [1, Ch. 4]. This is 
shown by the about and property attributes in the example above. Therefore, 
it is possible to extract RDF triples from an RDFa annotated web page 
by a simple mapping [15, p. 2]. RDFa represents RDF structure with pure 
XHTML, can be built with any vocabularies, and is extensible. This allows 
new specifications for describing content of web pages. However, it does not 
ensure that content of the same area is described with the same specification. 
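
The mapping from annotated markup to triples can be sketched with Python's standard html.parser module. This is a strongly simplified illustration, not an implementation of the RDFa processing rules: it handles only the about and property attributes, leaves CURIEs such as dc:title unexpanded, and assumes well-nested markup whose property elements contain text only. The class name RdfaExtractor is hypothetical.

```python
from html.parser import HTMLParser


class RdfaExtractor(HTMLParser):
    """Collect (subject, property, text) for elements with a property attribute."""

    def __init__(self):
        super().__init__()
        self.subjects = []   # one entry per open element: the subject in scope
        self.current = None  # (subject, property) whose text is being read
        self.buffer = []
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self.subjects[-1] if self.subjects else None
        # An 'about' attribute sets the subject for this element and its children.
        self.subjects.append(attrs.get("about", inherited))
        if "property" in attrs:
            self.current = (self.subjects[-1], attrs["property"])
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if self.current is not None:
            text = " ".join("".join(self.buffer).split())  # normalize whitespace
            self.triples.append((*self.current, text))
            self.current = None
        if self.subjects:
            self.subjects.pop()


PAGE = """<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div about="/alice/posts/trouble_with_bob">
    <h2 property="dc:title">Learning Opportunity Provider</h2>
    <p property="dc:description">Providers offer educational events.</p>
  </div>
</div>"""

extractor = RdfaExtractor()
extractor.feed(PAGE)
triples = extractor.triples
```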

Up to now, RDFa has been less widely implemented than microformats. 


4.4.2 Microformats 

Microformats have been developed by the people behind microformats.org 13 
and are defined on their website as follows: 

Designed for humans first and machines second, microformats 
are a set of simple, open data formats built upon existing and 
widely adopted standards. Instead of throwing away what works 
today, microformats intend to solve simpler problems first by 
adapting to current behaviors and usage patterns (e.g., XHTML, 
blogging). 


13 http://microformats.org 

This set of open data formats includes, inter alia, the rel and rev attributes 
that describe relationships of documents connected by links and their reverse 
links, XFN for describing relationships between people, hCalendar, hCard, 
and hAtom for content published as a web feed [4, Ch. 4-10]. 

Unlike RDFa, the technology of microformats can be integrated into both 
XHTML and HTML, and also works perfectly with CSS [15, p. 4]. It provides 
a compact syntax based on existing HTML, and is easy to implement. 
The following code shows an example using the hCalendar microformat to 
describe an event. 


<div id="hcalendar-event" class="vevent"> 
  <abbr title="2009-07-01" class="dtstart">July 1st 2009</abbr>, 
  <abbr title="2009-07-03" class="dtend">July 3rd 2009</abbr> 
  <span class="summary">Learning Opportunity</span> 
  <div class="description">This Learning Opportunity is specified by its 
    start and end date.</div> 
</div> 


A drawback of microformats is the limited number of specifications, as well 
as the difficulty of defining new microformats. In addition, each microformat 
requires a separate parsing rule [15, p. 4]. 
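
What such a format-specific parsing rule looks like can be sketched in Python for a small part of hCalendar. The sketch is an illustration under stated assumptions and covers only three properties: dtstart and dtend are read from the title attribute of an abbr element, as in the example above, and summary is read from the element text. The class name HCalendarParser is hypothetical.

```python
from html.parser import HTMLParser


class HCalendarParser(HTMLParser):
    """Extract dtstart, dtend and summary from vevent markup (simplified)."""

    FIELDS = {"dtstart", "dtend", "summary"}

    def __init__(self):
        super().__init__()
        self.event = {}
        self.pending = None  # field whose text content is still awaited

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for field in classes & self.FIELDS:
            if tag == "abbr" and "title" in attrs:
                # abbr pattern: the machine-readable value sits in the title
                self.event[field] = attrs["title"]
            else:
                self.pending = field

    def handle_data(self, data):
        if self.pending and data.strip():
            self.event[self.pending] = data.strip()
            self.pending = None


PAGE = """<div id="hcalendar-event" class="vevent">
  <abbr title="2009-07-01" class="dtstart">July 1st 2009</abbr>,
  <abbr title="2009-07-03" class="dtend">July 3rd 2009</abbr>
  <span class="summary">Learning Opportunity</span>
  <div class="description">This Learning Opportunity is specified by its
    start and end date.</div>
</div>"""

parser = HCalendarParser()
parser.feed(PAGE)
event = parser.event
```

A parser for hCard or any other microformat would need its own set of class names and value patterns, which is the drawback named above.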


4.4.3 Web Scraping for MLO-AD 

Web scraping is a popular technique for retrieving different kinds of 
information on the Web. It can also be used to collect information about 
learning opportunities described by MLO-AD. However, to be able to extract 
data specified by the complex MLO-AD data model, the web page needs to 
provide semantics through inline metadata formats to identify the various 
properties of a learning opportunity and its provider. 

Currently, microformats are more widely used and easier to implement than 
RDFa. The microformats technology comprises a limited set of existing 
specifications, some of which can be used to describe properties of 
MLO-AD objects. The hCalendar microformat, which is a calendaring and 
events format, would be best applicable to describe educational events. It 
covers properties for the date, duration, location, category, and description, 
which are needed by a learning opportunity instance. The hCard microformat 
is a format to represent people, companies, and organizations. It provides 
properties that can be used to describe a provider of an educational 
event. However, the MLO-AD data model includes more properties than 
those provided by microformats, for example, credit, qualification, 
prerequisite, and engagement. These properties cannot be described by 
existing specifications of microformats. Due to the fact that the specification 
of a new microformat is a difficult process, microformats are not well 
applicable as an inline metadata format for learning opportunities specified 
by MLO-AD. 

RDFa can only be used on XHTML web pages, but is easily extensible with 
new vocabularies. New concepts or properties are added by indicating 
(simply by using a URL) a directory which is saved somewhere on the Web 
and from which the specific new properties are imported [1, Ch. 2]. This 
means that to include MLO-AD properties on a 
document annotated by RDFa, these vocabularies need to be specified and 
published on the Internet, so that references to these MLO-AD properties 
can be created by including this namespace. However, the provision of these 
vocabularies does not ensure that required information is included for the 
comparison of learning opportunities, or that the web page meets a specific 
structure for an easy and valid parsing of the data. 

Web scraping of data described by MLO-AD is possible if the web page 
includes annotations with RDFa which identify the specific properties of 
learning opportunities and their providers. However, harvesters using the 
web scraping technique still have to face difficulties like missing or conflicting 
data or different structures of web pages describing learning opportunities. 


4.5 Web Services 

The World Wide Web Consortium defines web services as follows [16, Ch. 2]: 

A web service is a software system designed to support 
interoperable machine-to-machine interaction over a network. 

Web services have an interface to access application functionality by using 
the HTTP protocol. A simple web service workflow consists of a request by 
a client (web service requester) to the server (web service provider), and 
the subsequent response from the server. The form, as well as the underlying 
protocols of the request and the response, depend on the type of the 
web service. The most common types of web services are SOAP-based and 
RESTful. 

4.5.1 SOAP-based Web Services 

SOAP-based web services (also called "big web services") use the Simple 
Object Access Protocol (SOAP) as an interoperable messaging format. A 
SOAP message is an XML document that follows the SOAP standard by 
including three elements: an envelope, a header, and a body [40, Sec. 3.3.1]. 
The SOAP envelope can contain any XML data, and is itself wrapped 
by the HTTP request or response. 
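
The structure of such a message can be sketched in Python with the standard xml.etree.ElementTree module. The envelope namespace is the one of SOAP 1.1; the payload, consisting of a getLearningOpportunities call with a providerId argument in an example.org namespace, is purely hypothetical and not defined by MLO-AD.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
MLO_NS = "http://www.example.org/mlo/"                  # hypothetical payload namespace


def build_envelope(provider_id):
    """Wrap a hypothetical getLearningOpportunities call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")   # optional, left empty here
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{MLO_NS}}}getLearningOpportunities")
    ET.SubElement(call, f"{{{MLO_NS}}}providerId").text = provider_id
    return envelope


# Readable prefixes for the serialized message.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("mlo", MLO_NS)
message = ET.tostring(build_envelope("CRIM"), encoding="unicode")
```

The resulting string would be sent as the body of an HTTP POST request to the web service provider.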

SOAP-based web services usually use an architecture based on Remote 
Procedure Calls (RPC) [33, p. 19]. This means that to access a specific 
functionality of the server, the web service requester calls a function 
"remotely" on the server via the web service. SOAP-based web services have 
no limitation for defining functions, which is a main difference to RESTful 
web services. 

The interface of the web service is described in a machine-processable format 
called Web Services Description Language (WSDL) [16, Ch. 2]. With WSDL, 
clients can find out how to use the specific web service and which functions 
can be called. 


4.5.2 RESTful Web Services 

RESTful web services (REST stands for REpresentational State Transfer) have 
an architecture similar to that of the Internet, where each resource (web 
page in the web architecture) is uniquely addressable [33, p. 13]. Therefore, 
RESTful web services are resource-oriented, and not oriented to functions 
like RPC used by SOAP-based web services. Resources are identified by URIs. 

In REST requests, the method information goes into the HTTP method and 
the resource information is the URI sent with the request [33, p. 13]. Due to 
the fact that HTTP methods are used, the following methods are the most 
important ones available for RESTful web services: 

• GET retrieves or reads the resource with the given URI 

• POST creates the resource with the given URI 

• PUT updates the resource with the given URI 

• DELETE deletes the resource with the given URI 

RESTful web services support different representation formats. The most 
common formats are simple XML and JSON, but XHTML, RDF, or Atom 
can also be used [33, pp. 259-272]. 
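
The split between method information and resource information can be illustrated with Python's standard urllib.request module. The requests are only constructed, not sent. The service root, the resource paths, and the XML payload are hypothetical, and sending POST to the collection URI is a common convention that is assumed here.

```python
from urllib.request import Request

BASE = "http://www.example.org/api"  # hypothetical service root


def rest_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a resource URI."""
    headers = {"Content-Type": "application/xml"} if body is not None else {}
    return Request(BASE + path, data=body, headers=headers, method=method)


# The HTTP method carries the action, the URI names the resource.
requests_by_action = {
    "read": rest_request("GET", "/opportunities/42"),
    "create": rest_request("POST", "/opportunities", b"<opportunity/>"),
    "update": rest_request("PUT", "/opportunities/42", b"<opportunity/>"),
    "delete": rest_request("DELETE", "/opportunities/42"),
}
```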

Compared to SOAP-based web services, RESTful web services have gained 
more popularity because of their easier implementation. 

4.5.3 Web Services for MLO-AD 

Web services can be used for the communication between the data 
repository which holds information about learning opportunities and the 
aggregator portal based on MLO-AD. For this purpose, the learning 
opportunity provider must implement a web service which the aggregator 
portal can use to retrieve data about the provider’s educational events. This 
implementation requires additional effort from the learning opportunity 
provider. However, the web service can also be used to provide other systems 
with information about learning opportunities. 

Due to the fact that the communication between the aggregator portal and 
the provider is relatively simple, the web service does not need the 
complexity that SOAP-based web services would bring. 
RESTful web services are well applicable for retrieving information about 
learning opportunities and their providers. The aggregator portal uses the 
GET method with the URI of the specific learning opportunity object or 
of a collection holding more educational events and creates a new object 
or updates an existing one in its data repository. The RESTful web service 
of a learning opportunity provider can return various data types describing 
the educational event based on MLO-AD. Therefore, the aggregator portal 
must be able to parse responses of different types from the various learning 
opportunity providers. Regarding the structure of the MLO-AD data model, 
web services will usually return MLO-AD data in XML or RDF. 

To keep the aggregator’s data repository up-to-date, the aggregator must 
send the web service requests with the specific URIs at regular intervals. 
Unfortunately, the aggregator does not know if an update is actually 
necessary before sending the request. Even if the resource includes a 
timestamp of the last update, the aggregator portal can compare the dates 
only after receiving the response from the web service provider. This means 
that unnecessary data transfer can occur when using web services for 
collecting MLO-AD data. 
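
The create-or-update step and the late timestamp comparison can be sketched as follows. The response format with title and updated elements is a hypothetical stand-in for an MLO-AD representation, the repository is a plain dictionary, and the function name upsert is an assumption of this illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

repository = {}  # aggregator's store: resource URI -> record


def upsert(uri, response_xml):
    """Create or update a record; return 'created', 'updated' or 'unchanged'."""
    root = ET.fromstring(response_xml)
    record = {
        "title": root.findtext("title"),
        "updated": datetime.fromisoformat(root.findtext("updated")),
    }
    existing = repository.get(uri)
    if existing is None:
        repository[uri] = record
        return "created"
    if record["updated"] > existing["updated"]:
        repository[uri] = record
        return "updated"
    # The response had to be transferred before this could be known.
    return "unchanged"


URI = "http://www.example.org/api/opportunities/42"
V1 = ("<opportunity><title>RDF Basics</title>"
      "<updated>2009-07-01T12:00:00</updated></opportunity>")
V2 = V1.replace("2009-07-01", "2009-07-03")

results = [upsert(URI, V1), upsert(URI, V1), upsert(URI, V2)]
```

The second call illustrates the unnecessary transfer: the full response is parsed only to find that nothing has changed.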


4.6 OAI-PMH 

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 
is a protocol of the Open Archives Initiative 14 and provides an application- 
independent framework based on metadata harvesting [24, Ch. 1]. The pro- 
tocol is based on HTTP and is used to make a digital repository’s metadata 
available for harvest [32, p. 161]. 


14 http://www.openarchives.org 

Service Provider --HTTP Request--> Data Provider 
Service Provider <--HTTP Response-- Data Provider 

Figure 4.3: OAI-PMH Service Provider and Data Provider Architecture 
(according to [12, p. 71]) 


The architecture of OAI-PMH for harvesting information is built on two 
components: OAI data providers hold collections of content described by 
metadata, while OAI service providers harvest this content [12, p. 6]. Both 
components need an OAI-PMH interface to be able to communicate via the 
OAI-PMH protocol. Fig. 4.3 shows the architecture of OAI-PMH. Repositories 
of data providers are most commonly based on XML files, SQL databases, 
or Content Management Systems. With the Repository Explorer 15 , data 
providers can test their archives for compliance with OAI-PMH. 

All data providers must disseminate their metadata items in simple Dublin 
Core, but can additionally support other metadata formats [24, Sec. 3.4]. 
Dublin Core is a set of metadata elements that describe networked resources 
[12, p. 33]. It was first released in 1998 by the Dublin Core Metadata 
Initiative (DCMI) 16 . 

15 http://re.cs.uct.ac.za 
16 http://dublincore.org 

4.6.1 OAI-PMH HTTP Requests and Responses 

Requests for harvesting metadata via OAI-PMH are HTTP GET or POST 
requests. OAI-PMH defines six verbs to harvest metadata, of which exactly 
one has to be included in each HTTP request. The finite list of verbs 
covers the following elements, as described in [32, pp. 162-168]: 

• GetRecord: This verb returns an individual metadata record from the 
data repository. A metadata record covers all the information about 
a given resource and is uniquely identified by its identifier, metadata 
prefix and datestamp. The GetRecord request requires the use of the 
identifier and the metadataPrefix arguments. The latter specifies the 
metadata schema of the OAI-PMH response. 

• Identify: This verb is used to receive information about the repository, 
such as name, base URL, administrator’s e-mail address, and version 
of OAI-PMH. 

• ListIdentifiers: This verb retrieves the identifiers of a set of items from 
the data provider. This request must include a metadataPrefix and can 
contain arguments for a date range or a limit by set. 

• ListMetadataFormats: This verb returns the metadata schemas 
supported by the OAI-PMH repository. 

• ListRecords: This verb is used to retrieve a list of full metadata records. 
The metadataPrefix argument is required to select a specific metadata 
format. More arguments can be added to limit records by date or 
by set. 

• ListSets: This verb returns information about the sets currently 
registered on an OAI-PMH server. 

Responses of OAI-PMH requests are valid XML files conforming to the XML 
schema of OAI-PMH 17 . The workflow of OAI-PMH based on HTTP requests 
and responses is also displayed in Fig. 4.3. 
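
How the verbs and their arguments end up in a GET request can be sketched in Python. The verb and argument names and the oai_dc prefix for simple Dublin Core come from the protocol; the base URL, the record identifier, and the function name oai_request_url are hypothetical.

```python
from urllib.parse import urlencode

VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def oai_request_url(base_url, verb, **arguments):
    """Build the URL of an OAI-PMH GET request with exactly one verb."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # 'from' is a Python keyword, so callers pass it as from_
        params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


BASE_URL = "http://www.example.org/oai"  # hypothetical repository base URL
url = oai_request_url(BASE_URL, "GetRecord",
                      identifier="oai:example.org:lo-42",
                      metadataPrefix="oai_dc")
```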


4.6.2 OAI-PMH for MLO-AD 

The main requirement of OAI-PMH in regard to its data providers is the 
dissemination of Dublin Core as the metadata format. Besides specific 
properties about learning opportunities, MLO-AD also includes the elements 
of Dublin Core in its "Learning Opportunity Object" class. Therefore, 
MLO-AD can fulfill this requirement, and is applicable for harvesting via 
OAI-PMH. 

To be able to support MLO-AD, the data provider must include this 
metadata format via an XML schema, and by using a new namespace. This XML 
schema defines the specific elements and the structure of a metadata record 
about a learning opportunity resource. 

17 http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd 

An OAI-PMH harvester can only harvest metadata from a repository if this 
supports OAI-PMH via an interface. Therefore, providers of learning 
opportunities must implement a service to be able to accept OAI-PMH requests 
and return a valid XML file containing the required information about a 
learning opportunity or its provider. This implementation is an additional 
investment for an educational institution. However, with the provision of an 
OAI-PMH interface, the provider enables an easy and standardized way for 
harvesters to retrieve information about learning opportunities and to be 
able to advertise these educational events. 

A harvester for MLO-AD resources can use the ListMetadataFormats verb in
an HTTP request to ensure that the data repository supports the MLO-AD
metadata format. After that, the harvester sends an HTTP request with the
ListRecords verb to retrieve a list of full metadata records. The
metadataPrefix argument must be added to the request; it specifies MLO-AD
as the requested metadata format. The optional arguments "from" and "until"
limit the harvested list of metadata records by comparing the date stamp
of the last modification of each metadata record with the argument values.
This means that only learning opportunity resources added or updated since
the last harvesting process are collected if the "from" argument is set to
the date of that process.
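The two requests described above can be sketched as follows. This is a minimal illustration in Python, not code from the prototype; the repository URL and the metadata prefix "mlo-ad" are assumptions chosen for the example, while the verb and argument names are those of OAI-PMH.

```python
from urllib.parse import urlencode

# Hypothetical values: neither the endpoint nor the prefix comes from the thesis.
BASE_URL = "http://repository.example.org/oai"
MLO_AD_PREFIX = "mlo-ad"


def list_metadata_formats_url(base_url):
    """Build the request that asks which metadata formats a repository offers."""
    return base_url + "?" + urlencode({"verb": "ListMetadataFormats"})


def list_records_url(base_url, prefix, from_date=None, until_date=None):
    """Build a ListRecords request, optionally restricted by date stamps."""
    params = {"verb": "ListRecords", "metadataPrefix": prefix}
    if from_date is not None:
        params["from"] = from_date
    if until_date is not None:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_metadata_formats_url(BASE_URL))
    print(list_records_url(BASE_URL, MLO_AD_PREFIX, from_date="2009-07-01"))
```

Omitting both date arguments yields a full harvest; passing the date of the previous harvest as "from" yields an incremental one.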


4.7 Comparison of Solutions for MLO-AD 

Every technology for collecting data has to be adjusted to the specification
of the MLO-AD data model. For each technology, the learning opportunity
provider has to provide a special implementation or mapping of MLO-AD
data to enable data to be collected from its repository. This could be the
creation and publication of Atom feeds; the mapping of MLO-AD information
to the RDF specification, together with enabling remote queries of these
documents; an extension of the web pages containing the information about
learning opportunities with RDFa annotations; the creation of RESTful web
services; or the implementation of an OAI-PMH interface. In the future, the
aggregator portal based on MLO-AD will support all of these solutions for
data collection to address a wide range of learning opportunity providers.
However, the first version of the aggregator portal must support a way of
data collection that is appealing to many educational institutions. This
means that supporting and administering this way of data collection must be
easy and cheap for them. Additionally, the chosen technology should also
provide a secure and stable solution for the aggregator portal, so that it
can parse the collected data easily and correctly. Proper data collection
by the aggregator portal ensures good provision of information about
learning opportunities and an easy comparison of them.

RDF is a powerful but complex specification for describing resources.
However, it has difficulties establishing itself on the Web due to its
complex realization. Learning opportunity providers will prefer to describe
their educational offers via XML documents including the MLO-AD
specification, which are easier to create. XML is very popular on the Web,
so it can be assumed that the creation of such documents is well known.
Additionally, XML documents can include an XML schema which specifies the
structure of the document. An XML schema for describing resources with
MLO-AD can be published on the website of the aggregator portal, so that
the educational institutions can include it in their XML documents. Of
course, it cannot be ensured that learning opportunity providers actually
use the schema. However, if the XML schema is included, parsing the XML
document is easier for the aggregator portal.

Alternatively, learning opportunities could also be described directly on a
web page, which must then include RDFa annotations to enable web scraping
by the aggregator portal. It is easier to include RDFa in a website than to
create an RDF document to describe resources based on MLO-AD. Although RDFa
is less well known than XML, it is gaining popularity due to its easy
implementation. However, it is not possible to define a structure for the
provided content on the web page, which complicates a proper mapping of
data to the properties of MLO-AD. Additionally, like XML and RDF, RDFa is
not a technology for transmitting information over the Web, but it is a
good and easy solution for displaying and saving data about learning
opportunities based on MLO-AD.

Web services are a popular technology to communicate between web applica- 
tions and to transfer information described by XML. RESTful web services 
are especially easy to create and use. However, the web service solution has 
the drawback of unnecessary data transfer, as it does not allow the last 
modification date of a resource to be checked before it is received. 

By using Atom feeds for collecting data about learning opportunities, the
aggregator portal can check whether a data transfer is actually necessary
before the content is downloaded. Feeds are widely used on the Web and
easy to create, especially as many tools that facilitate the creation and
publication of feeds already exist. Unfortunately, feeds have the same
disadvantage as RDFa annotations: web feeds do not include a definition of
the content structure which describes the learning opportunity. Because of
this, the parsing of the XML content can be error-prone. However, due to
Atom’s recognition and popularity, this technology is still a good solution
for collecting data from learning opportunity providers.

OAI-PMH returns XML files that can include an XML schema describing the
structure of the document. It is also possible to avoid unnecessary data
transfer, because the request parameters allow the retrieved content to be
restricted according to the date of last modification. However, OAI-PMH
also requires a special implementation of an OAI-PMH interface to accept
HTTP requests based on this protocol. A downside of this technology in
connection with the implementation is the fact that OAI-PMH is not well
known and institutions may be reluctant to use it. However, a solution
based on OAI-PMH is easy for the learning opportunity provider to manage
and allows the aggregator portal to harvest and parse the data securely.


Recommended Solution 

The recommended solution for MLO-AD is based on OAI-PMH because of its
stability. To enable data collection via OAI-PMH, the aggregator portal and
all learning opportunity providers have to implement an interface which can
understand this protocol. The OAI-PMH interface of the educational
institution depends on the system of its data repository, as the interface
has to create the XML document including the information from the
repository. If the system of the data repository is known and commonly
used, an OAI-PMH interface can be provided to enable easy and automatic
data provision for the learning opportunity provider. After the
installation of this component, the educational institution can easily
accept HTTP requests based on OAI-PMH and send a valid XML file as a
response that includes the information about learning opportunities.

The XML file containing MLO-AD information uses a namespace of this
standard, which guarantees the use of valid elements. Additionally, it
includes an XML schema ensuring that the structure of the content matches
the data model of MLO-AD. The schema also specifies the occurrences of
each element and can define default values. Therefore, the XML schema
guarantees that harvested MLO-AD content complies with the requirements
of the aggregator portal.

Course management systems like Moodle 18 and Sakai 19 are widely used by
educational institutions, and usually cover all the information about
learning opportunities provided by the institution. An OAI-PMH interface
which can communicate with a course management system is certainly
appealing to learning opportunity providers who want to provide their
information to the aggregator portal based on MLO-AD. Therefore, the first
solution for MLO-AD data collection will additionally provide an OAI-PMH
interface that can be integrated with the course management system Sakai.
Learning opportunity providers can download this component from the website
of the aggregator portal and install it in their Sakai system.

18 http://moodle.org 

19 http://sakaiproject.org

Figure 4.4: System architecture of MLO-AD harvester


Fig. 4.4 shows the architecture of the MLO-AD data collection based on
OAI-PMH. The aggregator portal is developed by using the Google App
Engine 20 , which makes it easy to build web applications on a scalable
system. The Google App Engine supports Java as its development language.


20 http://code.google.com/appengine


Chapter 5 

User Interface 


The aggregator portal based on MLO-AD provides two different kinds of
user interfaces: the web portal is a web page that is accessible via a
public URL, while the social networking gadget provides integration with a
social network service. Both use the same logical system as their back end
and have access to the same data about learning opportunities. The
interfaces differ in the possibilities of interaction, as well as the
arrangement of web elements and information. They are adjusted to the user
experience and the applicability of web pages and social network services.


5.1 Web Portal

The web page of the aggregator portal allows prospective learners to browse
and search for learning opportunities. Additionally, it provides general
information and services about the system. It is the main point of
reference for the aggregator portal based on MLO-AD.

Besides the data about learning opportunities, the web portal offers
information for learning opportunity providers to enable them to advertise
their educational events via the aggregator portal. Educational
institutions and instructors can register and sign in to learn about the
technologies that the portal supports for collecting data based on MLO-AD.
Explanations and tutorials help learning opportunity providers to integrate
MLO-AD into their system and to implement their chosen technology to enable
data collection. To facilitate the implementation, the portal also provides
certain data repository tools and interfaces, which can be downloaded from
the web page. Educational institutions can also find a form to submit or
manage their information manually.

Figure 5.1: Wireframe of the web portal’s main page


Although the web portal offers much information and interaction, the
interface has to remain simple to provide easy orientation. Therefore, the
main page of the portal displays search fields in the center of the page,
and categories for browsing through learning opportunities on the left
side. The arrangement of elements for the main page of the web portal is
shown as a wireframe in Fig. 5.1. Wireframes are commonly used in interface
design to provide a visual guide to the structure of an interface and the
relationships between its pages before the design is developed [10, p. 91].

5.1.1 Searching for Learning Opportunities 

The search area of the web portal includes an input field for the subject
or title of the learning opportunity and a drop-down menu for choosing a
location. Selecting a location is optional; because the menu supports
multiple selection, several locations can also be chosen.

To add more specific information about a learning opportunity to a search
query, users can switch to another page by clicking on "Advanced Search."
This view includes input fields for properties like start date (supported
by a calendar to pick a date), qualification, cost, duration, credits,
language of instruction, engagement, prerequisite, assessment, and again
location. After values have been entered for one or more of these
properties, the portal creates a combined query which is placed in the main
input field of the search mask. Advanced users can enter such a combined
query directly without switching to the advanced search view. The following
line shows an example of a combined query for the MLO-AD web portal.

    java programming location:montreal cost:<500 prerequisite:none

This feature allows a functional refinement of a search for both common and
advanced users, and keeps the main page of the search screen simple and
clear.
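How such a combined query could be split into free-text keywords and property filters is sketched below. The sketch is written in Python for brevity and is only an illustration under stated assumptions: the function name and the tokenizing rule ("name:value" pairs, everything else is a keyword) are hypothetical and not taken from the portal's implementation.

```python
import re

# Matches "name:value" tokens, tolerating spaces around the colon.
FIELD_PATTERN = re.compile(r"(\w+)\s*:\s*(\S+)")


def parse_combined_query(query):
    """Split a combined query into free-text keywords and property filters."""
    filters = {name.lower(): value for name, value in FIELD_PATTERN.findall(query)}
    # Whatever is left after removing the property tokens is treated as keywords.
    remainder = FIELD_PATTERN.sub(" ", query)
    keywords = remainder.split()
    return keywords, filters


if __name__ == "__main__":
    example = "java programming location:montreal cost:<500 prerequisite:none"
    print(parse_combined_query(example))
```

Comparison operators such as "<500" are kept as plain strings here; interpreting them as ranges would be a separate step.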


5.1.2 Display of Search Results 

By default, the results of a search request are grouped by the educational
institution or instructor that is the learning opportunity provider. To
save the prospective learner from a flood of information, the overview of
the results only shows the name and location of each provider with the name
of a learning opportunity, as well as the starting dates of its sessions.
The names of the institution and its offers act as links to the source of
this information (usually the web page of the institution). If the search
request returns more than three educational events per provider, the number
of offers found is shown instead of their names. With a click on this
number, all learning opportunities of the institution are displayed. The
general view of results works with disclosure panels, which can be opened
and closed to show and hide further information about a learning
opportunity or its provider. Fig. 5.2 illustrates a wireframe with search
results in the general display mode.
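The grouping rule of the default view (list the offers of a provider by name, but show only a count once there are more than three) can be illustrated with a short Python sketch. The data layout, the function name, and the wording of the count label are assumptions made for this example.

```python
from collections import defaultdict

MAX_LISTED = 3  # above this many hits per provider, only the count is shown


def summarize_by_provider(results):
    """Group (provider, title) pairs and collapse long lists into a count."""
    grouped = defaultdict(list)
    for provider, title in results:
        grouped[provider].append(title)
    summary = {}
    for provider, titles in grouped.items():
        if len(titles) > MAX_LISTED:
            summary[provider] = f"{len(titles)} learning opportunities"
        else:
            summary[provider] = titles
    return summary


if __name__ == "__main__":
    hits = [("Provider A", "Java and OOP Basics"), ("Provider A", "Java and Co.")]
    print(summarize_by_provider(hits))
```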

To respond to different needs of users and to facilitate their search, the
results can be sorted and grouped by different properties. The button "Sort
By" opens a menu that includes program type, qualification, cost, starting
date, location, duration, credit, and engagement. The results are displayed
in the same way as in the general display mode, except that the "header" of
a result (which is the learning opportunity provider in the default view) is
replaced with the specific property.

Certain properties can be compared better in a design that differs from
the display of simple text. Therefore, the web portal allows its view of
search results to be switched to a map or a calendar. A map similar to a
Google Map gives a good overview of the locations of learning
opportunities. Pins mark educational events on the map, and show further
information when the mouse pointer is moved over them. The display of
search results on a calendar, in contrast, facilitates the search and
comparison of events for prospective learners with a specific time frame.
Colored fields indicate sessions of learning opportunities. By integrating
the duration and engagement (which specifies the pattern of attendance,
such as evenings or daytime), the calendar can show the complete time that
the learner has to spend on each educational event. Different colors of
fields identify different learning opportunities. Some of them can also be
held on multiple dates. The calendar display mode can show multiple
sessions of an educational event by fields with the same color but
different intensity.

Figure 5.2: Wireframe showing the general display mode of search results


5.1.3 Browsing through Learning Opportunities 

Users can discover learning opportunities by browsing through those
available on the aggregator portal based on MLO-AD. The main page of the
web portal displays categories of educational events (see Fig. 5.1). Many
contain sub-categories, which are shown underneath their parent after it is
clicked. Learning opportunities are displayed in the same way as the
results of a search request. It is also possible to change the view
according to the location or the date, and to sort the events by different
properties.

In addition, the web portal allows the browsing results to be refined
exactly as when refining a search. Before clicking on the link of a
category, a prospective learner can choose one or more locations or enter
keywords into the main input field. Hence, the user can specify
requirements, but still discover educational events that could not be found
by a search request.

The area that comprises the categories, as well as each sub-category,
includes an additional link called "Upcoming LOs." This feature allows
undecided learners to discover learning opportunities (that belong to one
specific category or to all categories) starting within the following days.

5.1.4 Comparison of Learning Opportunities 

Besides the different modes to display search and browsing results, the web
portal provides another feature for a better comparison of learning
opportunities. Users can find a checkbox next to each result to add an
educational event to a comparison matrix. After selecting events and
clicking the button "Create LO Matrix", another page is displayed for
choosing the MLO-AD properties that should be included in the matrix. The
created matrix shows the chosen learning opportunities as rows and the
properties of MLO-AD as columns, and provides a good overview of the events
the user is interested in.
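The layout of the comparison matrix (one row per selected learning opportunity, one column per chosen property) can be sketched in a few lines of Python. Representing a learning opportunity as a dictionary and the property names used in the example are assumptions for illustration only.

```python
def build_matrix(opportunities, properties):
    """Return a header row plus one row per selected learning opportunity."""
    header = ["title"] + list(properties)
    rows = [header]
    for lo in opportunities:
        # A property that a learning opportunity does not define becomes an empty cell.
        rows.append([lo.get("title", "")] + [lo.get(p, "") for p in properties])
    return rows


if __name__ == "__main__":
    selected = [
        {"title": "Java and OOP Basics", "duration": "4 weeks", "cost": "500"},
        {"title": "Java and Co.", "duration": "2 weeks"},
    ]
    for row in build_matrix(selected, ["duration", "cost"]):
        print(" | ".join(row))
```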
Figure 5.3: Wireframe of the social networking gadget


5.2 Social Networking Gadget 

Social network services enable new functionalities for gadgets due to the
different user behavior in these systems. Social networking means
communicating with friends or people who share the same interests. Popular
gadgets enable this communication, and are integrated in common actions
within social network services.

The social networking gadget of the aggregator portal based on MLO-AD is a
miniature of the web portal. Therefore, it offers only part of the portal’s
functionality, but it is adjusted to the user experience of social network
environments.

The gadget provides the search functionality for learning opportunities in
nearly the same way as the web portal. In a social networking environment,
only the main input field is displayed to allow a simple or combined search
query to be entered. The advanced search is still available, but without a
separate view for entering properties in specific fields. Instead, the
gadget offers a "Help" button which opens a small note that explains how to
create a combined search query. Results of a search are shown like results
in the web portal (by default, in the general display mode). They can also
be sorted and grouped by different properties. The gadget does not provide
a display of search results on a map or calendar. Browsing learning
opportunities or comparing results within a matrix is not supported.
However, a link to the web portal is always visible, so that users can
easily access all the functionality of the portal. Fig. 5.3 shows the
structure of the social networking gadget as a wireframe.

From the technical point of view, gadgets are XML files that include HTML
for markup, JavaScript for interactivity, and CSS for presentation. They
can also include an inline frame to point to another web page and represent
its content. The gadget for the aggregator portal based on MLO-AD uses the
OpenSocial 1 API for communicating with the social network service. Social
applications based on OpenSocial can be integrated with all social network
services that support this API. Functions of OpenSocial allow, among other
things, receiving information about the current user and friends, modifying
activities that describe what the user is doing, and sending messages to
contacts of the user.

The social networking gadget for MLO-AD allows users to share learning
opportunities with their friends. The "Share with Contacts" button next
to each search result opens a window with a list of the user’s friends.
The user can then choose one or more contacts by clicking their names in
the list. The system notifies these contacts of the educational event by
sending a message that includes the information about the event. The
message is sent within the social network service; this feature is provided
by all services that support OpenSocial.

Many users of social network services like to inform their friends about
their current activities. A session of a learning opportunity found by the
MLO-AD gadget can also be referenced as an activity. A click on the
corresponding button opens a window where users can indicate whether they
will attend or have already attended this event, or whether they are
interested in an event but cannot participate for some reason. This feature
also allows a comment to be added to the activity which includes the
learning opportunity. After this process, a link to the learning
opportunity, the user’s opinion of it, as well as the comment, are shown as
the current activity of the user.

Most social network services already allow the creation of a group that
everyone or only certain people can participate in. A person interested in
a learning opportunity or the instructor of the event can create a group
relating to it by clicking on the corresponding button. Friends can then be
invited to this group, discussions can be started, questions can be asked,
and comments can be posted.


1 http://www.opensocial.org 


Chapter 6 

The Prototype 


The first web application based on MLO-AD places its emphasis on the
implementation of data collection. It implements the solution recommended
in Section 4.7 for aggregating MLO-AD data. Additionally, a simple user
interface is provided to allow searches for learning opportunities.

The following three components form the architecture of the prototype: The
aggregator portal is the core element of the implementation. It collects
data about learning opportunities and provides the aggregated information
to users or other services. The second part is the Sakai plug-in, which
facilitates the support of data collection for learning opportunity
providers, while the third element, the OpenSocial gadget, allows an
integration of the portal’s features with social networking services.

The implementation’s aim is to meet the success criteria of the prototype
that are described in Section 3.3. In a nutshell, the prototype aims to
demonstrate the possibilities and applicability of the MLO-AD standard.


6.1 Aggregator Portal with Google App Engine

The aggregator portal based on MLO-AD is the connector between multiple
providers of learning opportunities and prospective learners or services
offering information about learning opportunities. Its main purpose is the
collection of data structured by MLO-AD. With a database full of aggregated
learning opportunities, the portal can supply learners with the information
they need in a structured and comprehensive way.

The portal is developed with the programming language Java on Google App
Engine (GAE) 1 . This complete development stack offers familiar
technologies to build and host web applications that can be easily
maintained and scaled. The maintenance of the server is carried out by
Google.

1 http://code.google.com/appengine


Figure 6.1: Aggregator portal of the prototype


The front-end of the prototype is created with Google Web Toolkit (GWT) 2 .
This technology allows a high-performance JavaScript front-end including
AJAX to be built easily. The code is written in the programming language
Java, which is converted to JavaScript by GWT, and works across all major
browsers. Fig. 6.1 shows the portal’s front-end created with GWT.

For the database, the prototype uses Java Data Objects (JDO) 3 , which
makes the data of learning opportunities and their providers persistent.
JDO works with annotations on Java classes ("plain old Java objects" or
POJOs). These annotations define how instances of the class are saved and
which of its properties are stored. Instances are stored as entities in the
database of App Engine. JDO includes a query interface (JDOQL) that allows
an easy re-creation of class instances from the entities of the database.
The prototype’s database consists of various entity types. For the
properties of MLO-AD, each resource of the standard’s data model (Learning
Opportunity Provider, Learning Opportunity Specification, and Learning
Opportunity Instance) forms a single entity type. Instances of these types
are connected with properties like offers, offeredAd, and specifiedFrom.
Additionally, each MLO-AD object can refer to another instance of its type
by using the hasPart reference. The database also includes other entities
needed to save information for collecting data from learning opportunity
providers. The complete database model of the prototype is shown in
Fig. 6.2.


2 http://code.google.com/webtoolkit

3 http://java.sun.com/jdo

The collection of MLO-AD data is implemented with the technology
recommended in Section 4.7. Therefore, the prototype includes an OAI-PMH
harvester for MLO-AD metadata. Additionally, the portal provides an
aggregator for Atom feeds that contain MLO-AD information to show a second
suitable possibility for collecting information based on this specific
standard. The following subsections describe the implementations of both
solutions.

The prototype of the aggregator portal based on MLO-AD is published at

http://mlo-app.appspot.com.


6.1.1 OAI-PMH Harvester

The portal’s harvester collects information about learning opportunities
via OAI-PMH from repositories that support this harvesting protocol. Links
to these repositories are saved in the OaiRepoListEntry table of the
database (see Fig. 6.2). OAI-PMH works with simple HTTP requests and six
different verb parameters to define the action of the request. The OAI-PMH
harvester of the MLO-AD portal uses two verbs for collecting and updating
its data: ListMetadataFormats returns a list of the repository’s supported
metadata formats and confirms the support of MLO-AD. The actual data about
learning opportunities are delivered via the ListRecords verb, and can be
limited to MLO-AD data by adding a metadata prefix as a parameter.

The response of the OAI-PMH request is an XML file, which is read by using
the de facto standard Simple API for XML (SAX). SAX is an event-based
parser that reports elements while it is still reading the document
[18, Ch. 20]. Therefore, the processing can start before the entire
document is read, which saves time and, most importantly, memory.
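The event-based style of SAX can be illustrated with Python's standard xml.sax module; the prototype itself uses the Java SAX API, so this is only an analogous sketch. The handler collects the metadataPrefix values of a ListMetadataFormats response as the elements stream by. The sample response and the prefix "mlo-ad" are assumptions made for the example.

```python
import xml.sax

# Hypothetical, shortened ListMetadataFormats response used for illustration.
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>mlo-ad</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


class MetadataPrefixHandler(xml.sax.ContentHandler):
    """Collects the text of every <metadataPrefix> element as it is parsed."""

    def __init__(self):
        super().__init__()
        self.prefixes = []
        self._buffer = None  # not None only while inside <metadataPrefix>

    def startElement(self, name, attrs):
        if name == "metadataPrefix":
            self._buffer = []

    def characters(self, content):
        # SAX may deliver the text of one element in several chunks.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "metadataPrefix" and self._buffer is not None:
            self.prefixes.append("".join(self._buffer).strip())
            self._buffer = None


def supported_prefixes(xml_bytes):
    """Return all metadata prefixes announced in a ListMetadataFormats response."""
    handler = MetadataPrefixHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.prefixes


if __name__ == "__main__":
    print("mlo-ad" in supported_prefixes(SAMPLE_RESPONSE))
```

Because the handler keeps only the text of the element it is currently inside, memory use does not grow with the size of the response.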

After the first OAI-PMH request, the MLO-AD harvester reads the returned
XML file (which contains the supported metadata formats) and checks if
MLO-AD is included. If so, the request with the ListRecords verb is sent
with attached parameters for the metadata prefix of MLO-AD, the date of
the last update from OaiRepoListEntry as the from parameter, and the
current date as the until parameter. The date parameters ensure that only
new or updated OAI-PMH records are returned. Like the first response, the
newly received XML file is parsed by the SAX parser and the information
about learning opportunities is saved in the App Engine database. Because
this XML file references an XML schema (whose reference is also included in
the ListMetadataFormats response), parsing MLO-AD records from an OAI-PMH
repository is relatively error-free.


6.1.2 Atom Feed Aggregator 

The second technology for collecting MLO-AD data is based on Atom feeds.
The portal includes an aggregator that parses Atom feeds and saves the
information described by MLO-AD properties in the database.

Links to the feeds, which are aggregated at regular intervals, are saved as
a feed list in the database. This list is called AtomFeedListEntry (see
Fig. 6.2), and includes the date of the last update for each entry. To
update the data, the aggregator iterates over this list and compares the
date in the list with the update date of the Atom feed, which must be
included in the feed. Therefore, only feeds that were modified since the
last update are parsed.
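The date comparison can be sketched as follows, again in Python and independent of the prototype's Java code. The feed-level updated element is part of the Atom format; the function names and the sample feed are hypothetical. The stored date is assumed to be a timezone-aware datetime so that it can be compared with the feed's timestamp.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical minimal feed used for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Courses</title>
  <updated>2009-08-12T09:30:00Z</updated>
</feed>"""


def feed_updated(feed_xml):
    """Read the feed-level <updated> timestamp of an Atom document."""
    root = ET.fromstring(feed_xml)
    text = root.findtext(ATOM_NS + "updated")
    if text is None:
        raise ValueError("feed has no <updated> element")
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def needs_parsing(feed_xml, last_update):
    """True if the feed changed after the date stored in the feed list."""
    return feed_updated(feed_xml) > last_update


if __name__ == "__main__":
    stored = datetime(2009, 8, 1, tzinfo=timezone.utc)
    print(needs_parsing(SAMPLE_FEED, stored))
```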
To parse an Atom feed, the aggregator portal uses ROME 4 , which is a set
of open source Java tools for parsing, generating and publishing RSS and
Atom feeds. It is based on the JDOM 5 XML parser to generate Java objects
from the feed’s XML source. ROME was chosen for the aggregator portal
not only because it simplifies the parsing process, but also because
of its easy integration into Google App Engine. Although ROME facilitates
reading an Atom feed, the content of the feed that includes XML tags with
MLO-AD properties must be parsed "manually." The aggregator receives the
feed’s content as a String, searches it for each valid property, and saves
the corresponding value in the database.
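What such a "manual" search through the content string might look like is sketched below in Python. The tag prefix "mlo:" and the list of property names are assumptions for illustration; the thesis does not show the actual tag names at this point, and the prototype performs this step in Java.

```python
import re

# Assumed subset of property names and an assumed "mlo:" tag prefix.
KNOWN_PROPERTIES = ("title", "subject", "duration", "prerequisite")


def extract_properties(content):
    """Search the content string for each known property tag in turn."""
    found = {}
    for name in KNOWN_PROPERTIES:
        pattern = r"<mlo:{0}>(.*?)</mlo:{0}>".format(re.escape(name))
        match = re.search(pattern, content, re.DOTALL)
        if match:
            found[name] = match.group(1).strip()
    return found


if __name__ == "__main__":
    sample = "<mlo:title>Java and Co.</mlo:title><mlo:duration>4 weeks</mlo:duration>"
    print(extract_properties(sample))
```

Tags that are not in the list of known properties are simply ignored, which also shows why this approach is more error-prone than schema-based parsing: a misspelled tag goes unnoticed.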


6.1.3 Automatic Update of MLO-AD Data 

Google App Engine includes a Cron Service to execute scheduled tasks at
specific times or regular intervals. These so-called cron jobs are commonly
used in Unix-like operating systems for an automatic update of data or the
creation of backup files 6 . In App Engine, the cron.xml file defines
cron jobs with specific URLs that are invoked at a given time of day.

The aggregator portal based on MLO-AD needs scheduled tasks to regularly
collect information about learning opportunities and to keep the data up to
date. The following lines show the cron.xml of the prototype. It contains
a cron job that updates the data of all Atom feeds every day at midnight
in the New York time zone. Every day at 12:30 a.m. in the New York time
zone, the second cron job invokes a URL to harvest and update data from all
known OAI-PMH repositories.


    <?xml version="1.0" encoding="UTF-8"?>
    <cronentries>
      <cron>
        <url>/mloapp/mloAtom</url>
        <description>Updates data of all Atom feeds from the feed list.</description>
        <schedule>every day 00:00</schedule>
        <timezone>America/New_York</timezone>
      </cron>
      <cron>
        <url>/mloapp/mloOai</url>
        <description>Updates data of all referenced OAI-PMH repositories.</description>
        <schedule>every day 00:30</schedule>
        <timezone>America/New_York</timezone>
      </cron>
    </cronentries>


4 https://rome.dev.java.net

5 http://www.jdom.org

6 http://www.unixgeeks.org/security/newbie/unix/cron-1.html


6.1.4 Search Functionality

The prototype includes only one input field to search for learning
opportunities. Although the user interface of the prototype does not offer
the possibility of switching to an advanced search view, the search allows
a reduced version of an advanced search due to the integration of
Compass 7 . Compass is a Java search engine framework developed as an open
source project and built on top of Lucene 8 , which is an Apache project
for search software. Like JDO, Compass uses annotations to define the
searchable properties of a plain old Java object (POJO). Compass indexes
the data to provide search results quickly. Another reason for choosing
Compass as the prototype’s search engine framework is its easy integration
with Google App Engine.

For now, the prototype only allows searches for properties of a learning
opportunity specification. Only this Java class is enriched with Compass
annotations. In particular, the properties title, subject, and type are
searchable with the prototype. The syntax of a combined search query looks
as follows; the last three parts are optional.

    [keyword] title:[keyword] subject:[keyword] type:[keyword]


Search results, which are displayed underneath the search field as in
Fig. 6.1, are shown in a simple way. A sorting of results by specific
properties is not supported by the prototype. In addition, the possibility
to change the view from text display to a map or calendar has not been
implemented yet.


6.2 Sakai Plug-In 

Many educational institutions use a course management system like Sakai 9 or
Moodle 10 to manage their courses and resources. The prototype provides
a Sakai plug-in (developed for version 2.5.4) that helps institutions using
Sakai to provide information based on MLO-AD.

Sakai is built with Java and the Spring framework 11 , and consists of dif- 
ferent tools for coursework, communication, and collaborative work. New 
components can easily be added as tools or as simple Java web servlets. 

To support both data collection technologies of the prototype's aggregator,
the Sakai plug-in includes an interface for OAI-PMH and a feed generator
for Atom feeds with MLO-AD properties. Both features are implemented as
Java web servlets and can be invoked via their servlet URL pattern. Sakai
offers several ways to describe properties of courses or of the institution,
for example course site descriptions and general or configuration
information in a properties file. However, Sakai cannot supply all of the
information that MLO-AD requires. Therefore, the Sakai plug-in additionally
contains a template for describing courses, which is enriched with RDFa
annotations. This template is easy to manage and also demonstrates the
practicability of RDFa.

7 http://www.compass-project.org

8 http://lucene.apache.org

9 http://sakaiproject.org

10 http://moodle.org

11 http://www.springsource.org


6.2.1 RDFa Template 

Each course in Sakai uses a specific Sakai site, which can contain different
tools such as syllabus, resources, calendar, portfolio, and chat. Sites in Sakai
can also include a general site description to give information about the
course. Sakai provides different templates for these descriptions, which can
be used to give the information a specific structure.

The Sakai plug-in of the aggregator portal adds a new site description
template to the ones already available in Sakai. This template is a simple
XHTML table with all valid properties of MLO-AD, so that the user can easily
add the specific values to the fields. The code of the table is enriched
with RDFa annotations that provide semantics for machines. Therefore, the
template not only makes it easier for learning opportunity providers to
supply MLO-AD information, but also allows web bots or aggregators to
extract MLO-AD data from this site. The following lines illustrate an
example of the template's source code.

<table width="100%" cellspacing="0" cellpadding="0" border="0"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xmlns:dc="http://purl.org/dc/elements/1.1/"
       xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">
<tbody>
<tr><td>Title</td><td property="dc:title">Java and OOP Basics</td></tr>
<tr><td>Subject</td><td property="dc:subject">Java</td></tr>
<tr><td>Description</td><td property="dc:description">You will learn the basics of Java and OOP.</td></tr>
<tr><td>Contributor</td><td property="dc:contributor">Katharina</td></tr>
<tr><td>Qualification</td><td property="mlo:qualification">&nbsp;</td></tr>
<tr><td>Level</td><td property="mlo:level">&nbsp;</td></tr>
<tr><td>Credits</td><td property="mlo:credit">&nbsp;</td></tr>
<tr><td>Location</td><td property="mlo:location">Montreal, CRIM</td></tr>
<tr><td>Start Date</td><td datatype="xsd:dateTime" property="mlo:start">2009-08-12 09:30:00</td></tr>
<tr><td>Duration</td><td property="mlo:duration">6 h</td></tr>
<tr><td>Cost</td><td property="mlo:cost">250 CAD</td></tr>
<tr><td>Prerequisite</td><td property="mlo:prerequisite">some</td></tr>
<tr><td>Max. Number of Places</td><td property="mlo:places">15</td></tr>
<tr><td>Language of Instruction</td><td property="mlo:languageOfInstruction">english</td></tr>
<tr><td>Engagement</td><td property="mlo:engagement">daytime, workplace-based</td></tr>
<tr><td>Objective</td><td property="mlo:objective">Java</td></tr>
<tr><td>Assessment</td><td property="mlo:assessment">&nbsp;</td></tr>
</tbody>
</table>
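How a web bot or aggregator might read such a template can be sketched in a few lines of Java. The example below is a simplified illustration, not the prototype's extraction code and not a full RDFa processor: it uses a regular expression to collect the text of every table cell that carries a property attribute and skips cells that only contain a non-breaking space. The class name RdfaTemplateReader is hypothetical, and prefixes such as dc: and mlo: are kept as written instead of being resolved to their namespaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regex-based reading of the template, not real RDFa parsing.
final class RdfaTemplateReader {

    // Matches <td ... property="prefix:name" ...>value</td>
    private static final Pattern CELL = Pattern.compile(
            "<td\\b[^>]*\\bproperty=\"([^\"]+)\"[^>]*>(.*?)</td>",
            Pattern.DOTALL);

    /** Returns property names mapped to their non-empty cell values. */
    static Map<String, String> read(String xhtml) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher matcher = CELL.matcher(xhtml);
        while (matcher.find()) {
            // Empty template fields contain only &nbsp; and are ignored
            String value = matcher.group(2).replace("&nbsp;", " ").trim();
            if (!value.isEmpty()) {
                values.put(matcher.group(1), value);
            }
        }
        return values;
    }
}
```

Applied to the template above, read would return entries such as dc:title and mlo:cost, while unfilled fields like mlo:level would be omitted. A production aggregator would more likely rely on a dedicated RDFa parser.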


6.2.2 OAI-PMH Interface 

To be able to accept OAI-PMH requests, Sakai needs an interface that
prepares its data to support this protocol. Since implementations of
OAI-PMH repositories already exist, the Sakai plug-in uses the open source
framework OAICat 12 developed by the Online Computer Library Center
(OCLC) 13 under the OCLC Research Public License 2.0. OAICat significantly
facilitates the support of OAI-PMH, but still needs slight adjustments to
MLO-AD and to the data system of the learning opportunity provider or Sakai.

MLO-AD can be added to the OAICat framework by specifying its prefix,
namespace, and a valid XML schema. Due to the integrated Dublin Core
elements of MLO-AD, this standard is a valid metadata format for OAI-PMH.
The Sakai plug-in takes the MLO-AD XML schema 14 created by Mark Stubbs
from the MLO-AD working group as a reference.
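From the aggregator's point of view, the metadata prefix is what a harvester passes to the repository when it asks for records. The following Java sketch builds an OAI-PMH ListRecords request URL; the verb and the metadataPrefix and from arguments are defined by the OAI-PMH protocol, whereas the class name OaiPmhRequest, the base URL, and the prefix value mlo_ad used in the usage note are assumptions for illustration only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building OAI-PMH harvesting requests.
final class OaiPmhRequest {

    /**
     * Builds a ListRecords request. The optional "from" date limits the
     * harvest to records changed since that day (pass null to omit it).
     */
    static String listRecords(String baseUrl, String metadataPrefix, String from) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("?verb=ListRecords")
                .append("&metadataPrefix=").append(encode(metadataPrefix));
        if (from != null && !from.isEmpty()) {
            url.append("&from=").append(encode(from));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, listRecords("http://example.org/oai", "mlo_ad", "2009-06-22") produces a URL that asks the repository for all MLO-AD records modified since 22 June 2009, which matches the idea of updating the aggregator's data at regular intervals.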

OAICat already supports different data systems such as the file system,
databases, and Java Database Connectivity (JDBC). However, it can always be
extended by individual implementations to support another data repository.
The prototype's OAI-PMH interface stays simple and uses the file system as
the source of its data. It takes the information about learning opportunities
and their providers from XML files and converts it for the transfer via
OAI-PMH. These XML files comply with the specifications of MLO-AD's
XML schema. The OAI-PMH interface of the prototype does not use the
information stored in Sakai yet; this remains future work.

12 http://www.oclc.org/research/software/oai/cat.htm

13 http://www.oclc.org 

14 http://wiki.teria.no/confluence/display/CIF/MLO-AD+illustrative+XML+binding 
6.2.3 Atom Feed Generator

For the second technology of data collection, the Sakai plug-in uses an Atom
feed generator that creates or updates a feed whenever a site description
that includes the MLO-AD template with RDFa annotations is modified.
This happens via a JavaScript function which is attached to Sakai's site
description and invokes the URL of the Atom generator servlet. As a result,
MLO-AD information in Sakai is written to an Atom feed as soon as it changes.

The generator reads the information of the site description and uses Apache
Abdera 15 to generate the feed. Abdera is an open source implementation of
the IETF Atom Syndication Format and Atom Publishing Protocol standards.
Properties of MLO-AD are written within XML tags in the content element of
a feed entry. To mark these properties as MLO-AD elements, the tags carry
the XML namespace of MLO-AD. Information about the learning opportunity
provider is specified by children of the feed element. Each entry of a feed
describes a learning opportunity specification, including information about
one or more associated learning opportunity instances. The following feed
is an example that contains one entry specifying a learning opportunity.


<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>tag:gtn.sakaiquebec.org,2009:/mlo-rdfa/mlo.xml</id>
  <title type="text">GTN Quebec Sakai</title>
  <author><name>GTN Quebec</name></author>
  <contributor><name>GTN Quebec</name></contributor>
  <link href="http://gtn.sakaiquebec.org/portal" rel="alternate"/>
  <link href="http://gtn.sakaiquebec.org/mlo-rdfa/mlo.xml" rel="self"/>
  <updated>2009-06-22T19:00:01.293Z</updated>
  <entry>
    <id>tag:gtn.sakaiquebec.org,2009:/portal/site/1d9c9fbc-4ded-4755-82fd-1a58b70c04b6</id>
    <title type="text">Java and OOP Basics</title>
    <updated>2009-06-20T16:10:55.948Z</updated>
    <published>2009-05-18T04:30:23.579Z</published>
    <link href="http://gtn.sakaiquebec.org/portal/site/1d9c9fbc-4ded-4755-82fd-1a58b70c04b6"/>
    <contributor><name>Katharina</name></contributor>
    <summary type="text">You will learn the basics of Java and OOP.</summary>
    <category term="Java"/>
    <content type="application/xml">
      <dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/">Java</dc:subject>
      <mlo:location xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">Montreal, CRIM</mlo:location>
      <mlo:start xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">2009-08-12T07:30:00.000Z</mlo:start>
      <mlo:duration xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">6 h</mlo:duration>
      <mlo:cost xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">250 CAD</mlo:cost>
      <mlo:languageOfInstruction xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">english</mlo:languageOfInstruction>
      <mlo:places xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">15</mlo:places>
      <mlo:engagement xmlns:mlo="http://mlo-app.appspot.com/mlo/elements/">daytime, workplace-based</mlo:engagement>
    </content>
  </entry>
</feed>

15 http://abdera.apache.org
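On the receiving side, an aggregator has to pick the MLO-AD elements out of the content element by their namespace. The Java sketch below shows one possible way to do this with the namespace-aware DOM parser of the Java standard library. It is an assumption-laden illustration rather than the prototype's code: the class name MloFeedReader is hypothetical, only the first content element is read, and the MLO namespace URI is the one used in the example feed above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads MLO-AD extension elements from an Atom feed.
final class MloFeedReader {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";
    static final String MLO_NS = "http://mlo-app.appspot.com/mlo/elements/";

    /** Returns the MLO-AD properties inside the first content element. */
    static Map<String, String> firstEntryProperties(String atomXml) {
        Map<String, String> properties = new LinkedHashMap<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to compare namespace URIs
            Document document = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(atomXml.getBytes(StandardCharsets.UTF_8)));
            NodeList contents = document.getElementsByTagNameNS(ATOM_NS, "content");
            if (contents.getLength() == 0) {
                return properties;
            }
            NodeList children = contents.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Keep only element nodes in the MLO-AD namespace
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && MLO_NS.equals(child.getNamespaceURI())) {
                    properties.put(child.getLocalName(), child.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Feed could not be parsed", e);
        }
        return properties;
    }
}
```

Elements from other namespaces, such as dc:subject, are skipped here on purpose to keep the sketch short; a real aggregator would map them to the corresponding MLO-AD properties as well.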


6.3 OpenSocial Gadget 


The prototype implementation of the aggregator portal based on MLO-AD
additionally includes an XML file that defines a gadget for social networking
services. It can be integrated with all services that support the OpenSocial
API, which are, among others, MySpace 16 , LinkedIn 17 , Xing 18 , and Hi5 19 .

This gadget is developed according to the XML schema of OpenSocial. The
code below shows the full XML document of the current version. For the
prototype, the gadget is a simple implementation that allows learning
opportunities to be searched, but does not include social features like
sharing educational events with friends. Therefore, the OpenSocial XML file
only contains an inline frame pointing to a web page of the aggregator
portal that is adjusted to the view of the social gadgets. An OpenSocial
gadget can also specify CSS for the presentation, which would be defined in
the style tag at line 14. The script tag from lines 15 to 17 contains the
JavaScript code, which usually includes OpenSocial JavaScript functions to
allow communication with the social environment. The implementation of the
prototype only adjusts the height of the gadget and does not support social
activity yet.

1 <?xml version="1.0" encoding="UTF-8" ?>
2 <Module>
3 <ModulePrefs title="MLO Search"
4 description="MLO Search allows searching for learning opportunities based on MLO-AD"
5 author="Katharina Bauer-Oeppinger">
6 <Require feature="opensocial-0.8"/>
7 <Require feature="views" />
8 <Require feature="dynamic-height" />
9 </ModulePrefs>
10 <Content type="html" view="default,home,profile,canvas"> <![CDATA[
11 <iframe src="http://mlo-app.appspot.com/MLOAppGadget.html" width="100%" height="100%">
12 <p>Your browser does not support iframes.</p>
13 </iframe>
14 <style type="text/css"></style>
15 <script type="text/javascript">
16 gadgets.window.adjustHeight();
17 </script>
18 ]]>
19 </Content>
20 </Module>

16 http://www.myspace.com

17 http://www.linkedin.com

18 http://www.xing.com

19 http://www.hi5.com

Figure 6.3: Screenshot of the prototype version of the OpenSocial gadget


After the search process, the results are displayed within the gadget in a
simple way. The prototype does not support sorting the results by specific
properties. Fig. 6.3 shows the prototype gadget with search results.


Chapter 7 

Conclusions 


This thesis comprises an analysis of data collection techniques suited to
the MLO-AD standard, an exploration of the fundamental design of a
portal that offers learning opportunities, and the implementation of a
prototype.

Different technologies, particularly those already used by existing aggregator
systems, are analyzed with regard to their suitability for collecting MLO-AD
data. Due to the complex structure of MLO-AD, information based on this
standard has to be enriched with semantics by using XML or a Semantic
Web technology. The analysis shows that XML is used in many solutions
(web feeds, web services, and OAI-PMH) because it is the prevalent standard
for exchanging data between applications over the Web. Additionally, it is
well known, commonly used, and can specify its structure with an XML
schema. To comply with MLO-AD, the structure of the XML must match the
data model of this standard. This can be ensured by a schema like the one
created by Mark Stubbs from the MLO-AD working group 1 . The analyses of
Semantic Web technologies (RDF, OWL, RDFa, and microformats) consider
the possibilities for enhancing MLO-AD information with semantics to make
it machine-understandable. Although these technologies are powerful
languages for describing MLO-AD content, they are limited in the actual
transfer of the data from the learning opportunity provider to the
aggregator portal. Technologies for transferring RDF exist (RSS 1.0 and
SPARQL), but they have not been described further in this thesis because
they are not commonly used.

Although this thesis analyzes the possibilities to describe and transfer
MLO-AD data in detail, it does not consider the content of the data itself.
The MLO-AD specification does not address the vocabularies needed to ensure
an easy comparison of learning opportunities and interoperability between
different educational domains. Especially for properties like type,
qualification, level, and engagement, predetermined values would make
learning opportunities and their providers easily comparable. Additionally,
the specification of MLO-AD does not always clearly declare the meaning of
properties. For example, the date property (which is included in learning
opportunity providers, specifications, and instances, and which differs from
the start date of an instance; see Fig. 2.1 of Section 2.2) can be
interpreted as the date of the last update by the provider, of the last
update by the collector, or of the resource's creation. The reason for not
addressing vocabularies is that these values must be updated frequently.
Therefore, separate CEN Workshop Agreements (CWAs) will deal with
vocabularies for MLO-AD properties [39, pp. 2-3]. Guidelines for values of
MLO-AD properties would facilitate the use of the standard and the
comparison of learning opportunities and their providers.

1 http://wiki.teria.no/confluence/display/CIF/MLO-AD+illustrative+XML+binding

Regarding the design exploration of the aggregator portal based on MLO-AD,
this thesis offers different and contemporary ideas for presenting the
content and interacting with the system. It specifically addresses the needs
and expectations of the target audience, which is analyzed in Section 3.1.
Additionally, the exploration responds to the internet consumer phenomenon
of social networking by integrating the aggregator portal with such
a system. One issue that is not considered in the design analysis is the fact
that a learning opportunity can have a reference to another one, which is
identified by the hasPart property of MLO-AD. Possibilities to represent this
relation should be analyzed and realized in future work.

The implemented prototype of the aggregator portal based on MLO-AD
complies with the following requirements of Section 3.3:

• The database for saving the content of learning opportunities and their 
providers is based on MLO-AD. 

• The portal aggregates information about educational offers and
institutions automatically and at regular intervals. It does not only update
existing data, but also considers new information.

• A functionality for targeted searches is included. 

• The prototype offers a social gadget as an integration into social
networking systems.

• This gadget provides an interface with an input field to search for 
learning opportunities. 

• Results of a search query are presented in a simple way. 
The following two criteria have not been fully met, or rather have been
solved in a different way than was initially intended:

• The prototype does not harvest data about learning opportunities
from two different providers. However, it offers learning opportunity
providers a plug-in for the Sakai course management system and
implements two different technologies to collect data from this system.
This plug-in includes an interface for handling OAI-PMH requests, a
generator for creating Atom feeds with MLO-AD, and a template with
RDFa annotations for describing MLO-AD content. These components can
easily be integrated with any existing Sakai 2.5.4 installation.

• Information about educational offers is not directly collected from the 
training center of CRIM. The prototype uses examples of CRIM that 
are added to Sakai. 

Besides the social gadget, the prototype offers a general web portal to search
for learning opportunities. This portal additionally allows learning
opportunity providers to add a link that refers to their OAI-PMH
repository or Atom feed. The aggregator system then automatically
collects the data of this repository or feed at regular intervals.

Pending developments of the prototype particularly concern the security of
the system. To ensure a good quality of the offered information about
learning opportunities, the source and the provider of the information need
to be identified. Therefore, it is essential that learning opportunity
providers register and log in before they add a link that refers to their
data. Additionally, only real educational offers and no spam data may be
added to the database. Moreover, the prototype still needs a way to manage
the existing information about learning opportunities in the aggregator
database. For example, educational offers must be removed from the database
if they are not valid anymore.
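The clean-up step mentioned last could, for instance, select all stored offers whose end date has passed. The Java sketch below illustrates this idea under explicit assumptions: the thesis does not define when an offer stops being valid, so the end date criterion, the class ExpiredOfferFilter, and the simplified Offer type are all hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: selects offers that could be removed from the database.
final class ExpiredOfferFilter {

    /** Simplified stand-in for a stored learning opportunity instance. */
    static final class Offer {
        final String id;
        final LocalDate end; // assumed end date; may be null if unknown

        Offer(String id, LocalDate end) {
            this.id = id;
            this.end = end;
        }
    }

    /** Returns the ids of all offers that ended before the given day. */
    static List<String> expiredIds(List<Offer> offers, LocalDate today) {
        List<String> expired = new ArrayList<>();
        for (Offer offer : offers) {
            // Offers without an end date are kept, since validity is unknown
            if (offer.end != null && offer.end.isBefore(today)) {
                expired.add(offer.id);
            }
        }
        return expired;
    }
}
```

Such a check could run in the same scheduled job that updates the feeds and repositories, so that outdated offers disappear at regular intervals.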

Another necessary improvement of the prototype is the precise adjustment
of the data model to the structure of MLO-AD. Learning opportunities are
already saved in the database based on the MLO-AD model. However, the
properties shared by the provider, specification, and instance are defined
separately in each of them, and not in an abstract learning opportunity
object as shown in Fig. 2.1 of Section 2.2. A refactoring of the data model
and the code would make the latter clearer and more efficient. Moreover,
each MLO-AD property in the prototype can have an occurrence of zero or
one (except that the identifier has exactly one, and offers can have more than
one). The specification of MLO-AD does not declare a maximum occurrence
of its properties, so a learning opportunity can have several values for the
same property. This feature still must be implemented for the prototype.
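One possible shape of such a refactoring is sketched below in Java. The shared properties move into an abstract base class, and every property may hold any number of values, while the identifier remains mandatory and single-valued. All class and method names are hypothetical and chosen for this illustration; persistence annotations such as those of JDO are left out to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical base class holding the properties shared by all three objects.
abstract class LearningOpportunityObject {

    private final String identifier; // exactly one per object
    private final Map<String, List<String>> properties = new LinkedHashMap<>();

    protected LearningOpportunityObject(String identifier) {
        if (identifier == null || identifier.isEmpty()) {
            throw new IllegalArgumentException("identifier is required");
        }
        this.identifier = identifier;
    }

    String getIdentifier() {
        return identifier;
    }

    /** Adds a further value; a property may occur any number of times. */
    void addValue(String property, String value) {
        properties.computeIfAbsent(property, key -> new ArrayList<>()).add(value);
    }

    /** Returns all values of a property, or an empty list if it is unset. */
    List<String> getValues(String property) {
        List<String> values = properties.get(property);
        return values == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(values);
    }
}

class Provider extends LearningOpportunityObject {
    Provider(String identifier) { super(identifier); }
}

class Specification extends LearningOpportunityObject {
    Specification(String identifier) { super(identifier); }
}

class Instance extends LearningOpportunityObject {
    Instance(String identifier) { super(identifier); }
}
```

With this structure, a specification can, for example, carry two subject values, and code that handles common properties no longer needs to be duplicated for providers, specifications, and instances.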
The prototype is developed as an open source project under the Educational
Community License, Version 1.0. Therefore, anyone who is interested in the
project can contribute to the further development of this product. Many
possible solutions and design suggestions described in this thesis have not
been implemented yet. Furthermore, new ideas will come up due to the
developments of MLO-AD, the Web, as well as other technologies. These
ideas are, of course, more than welcome.



Appendix A 

CD-ROM Content 


Format: CD-ROM, Single Layer, ISO9660-Format 

A.1 Diploma Thesis

This diploma thesis is included in the CD-ROM as PDF, DVI, and PostScript 
documents. The copyright of this thesis lies with its author. 

Path: /thesis/ 


thesis.pdf Diploma thesis as PDF file 

thesis.dvi Diploma thesis as DVI file

thesis.ps Diploma thesis as PostScript file 


All images of this thesis are provided as EPS files. Their copyright lies with
the author of this thesis, unless otherwise stated.

Path: /thesis/images/ 

*.eps Images of diploma thesis as EPS files 

Online sources used in this thesis are saved on the CD-ROM as PDF docu- 
ments. The copyright of these sources lies with the respective authors. 

Path: /thesis/bibliography/ 

*.pdf Copies of referenced online sources
A.2 Source Code

The CD-ROM also contains the implementation of the prototype. The source
code and all necessary files for the installation are saved in the Aggregator
Portal and Sakai Plug-in folders; the latter is divided into the three
different components of the plug-in.

Path: /prototype/aggregatorPortal/ 


MLOApp/ Aggregator portai created with GAE 

INSTALL.txt Information about the installation 

mlo-gadget.xml XML file of OpenSocial gadget


Path: /prototype/sakaiPlugin/AtomFeedGenerator/ 


atom-rdfa-src/ Source code of new Sakai component 

INSTALL.txt Information about the installation 


Path: /prototype/sakaiPlugin/OAI-PMH/ 


oaicat-src/ Source code of modified OAICat 

INSTALL.txt Information about the installation 

mlo_ad.xsd Copy of Mark Stubbs' XML schema

mlo_example.xml Example of an XML file containing MLO-AD

oaicat.jar JAR file created from source code

oaicat.properties Modified properties file for OAICat

oaicat.war Copy of original OAICat WAR file


Path: /prototype/sakaiPlugin/RDFaTemplate/


reference/ Modified Sakai’s reference component 

site/ Modified Sakai’s site component 

velocity/ Modified Sakai’s velocity component 

INSTALL.txt Information about the installation 



Bibliography 


[1] Adida, B. and M. Birbeck: RDFa Primer: Bridging the Human and Data 
Web — W3C Working Group Note 14 October 2008. World Wide Web 
Consortium, October 2008. http://www.w3.org/TR/xhtml-rdfa-primer. 

[2] Alesso, H. P. and C.F. Smith: Thinking on the Web: Berners-Lee, Gödel,
and Turing. John Wiley & Sons Inc., 2006.

[3] Allemang, D. and J. Hendler: Semantic Web for the Working Ontologist: 
Effective Modeling in RDFS and OWL. Morgan Kaufmann Publishers, 
2008. 

[4] Allsopp, J.: Microformats: Empowering Your Markup for Web 2.0. 
friends of ED, 2007. 

[5] Beckett, D. and J. Broekstra: SPARQL Query Results XML Format - 
W3C Recommendation 15 January 2008. World Wide Web Consortium, 
January 2008. http://www.w3.org/TR/rdf-sparql-XMLres. 

[6] Berners-Lee, T.: Weaving the Web. Harper Collins Publishers Inc., 1999. 

[7] Berners-Lee, T., J. Hendler, and O. Lassila: The Semantic Web.
Scientific American, 284:34-43, May 2001.
http://www.sciam.com/article.cfm?id=the-semantic-web.

[8] Bouras, C., G. Kounenis, and I. Misedakis: A Web Content Manipula- 
tion Technique Based on Page Fragmentation. Journal of Network and 
Computer Applications, 30:563-585, 2007. 

[9] Brickley, D. and R. Guha: RDF Vocabulary Description Language 1.0:
RDF Schema - W3C Recommendation 10 February 2004. World Wide
Web Consortium, February 2004. http://www.w3.org/TR/rdf-schema.

[10] Butow, E.: User Interface Design for Mere Mortals. For Mere Mortals
Series. Addison-Wesley, 2007.



[11] Clark, K. G., L. Feigenbaum, and E. Torres: SPARQL Protocol for RDF 
- W3C Recommendation 15 January 2008. World Wide Web Consor- 
tium, January 2008. http://www.w3.org/TR/rdf-sparql-protocol. 

[12] Cole, T.W. and M. Foulonneau: Using the Open Archives Initiative Pro- 
tocol for Metadata Harvesting. Third Millennium Cataloging. Libraries 
Unlimited, 2007. 

[13] Confederation of EU Rectors' Conferences and Association of European
Universities (CRE): The Bologna Declaration on the European Space
for Higher Education: An Explanation. The European Higher Education
Area, February 2000.
http://ec.europa.eu/education/policies/educ/bologna/bologna.pdf,
includes the Joint Declaration of the European Ministers of Education
convened in Bologna on the 19th of June 1999.

[14] David, D.M., L. Goertz, B. Hildebrandt, A. Janus, H.M.P. Sohn,
U. Reichel, A. Reisky, L. Reß, C. Stracke, P. Tesch, and K. Unverricht:
PAS 1068 - Learning, Education, and Training with Special Consideration
of e-Learning: Guideline for the Description of Educational Offers.
Beuth Verlag GmbH, December 2006.

[15] Graf, A.: RDFa vs. Microformats: A Comparison on Inline Metadata
Formats in (X)HTML. Techn. rep., DERI - Digital Enterprise Research
Institute Innsbruck, Technikerstraße 21a, A-6020 Innsbruck, Austria,
April 2007.

[16] Haas, H. and A. Brown: Web Services Glossary - W3C Working Group
Note 11 February 2004. World Wide Web Consortium, February 2004.
http://www.w3.org/TR/ws-gloss.

[17] Hammersley, B.: Developing Feeds with RSS and Atom. O’Reilly, 2005. 

[18] Harold, E.R. and W.S. Means: XML in a Nutshell. O’Reilly, 3rd ed., 
2004. 

[19] Heaton, J.: Programming Spiders, Bots, and Aggregators in Java. Sybex,
2002.

[20] Jenssen, A.: ECTS Information Package/ Course Catalogue with HTML- 
and XML-Bindings. University of Oslo, October 2004. http://www.usit. 
uio.no/prosjekter/eSU/eSU-revisjon/CDM/cdm-ects.pdf. 

[21] Johnson, D.: RSS and Atom in Action: Web 2.0 Building Blocks. Man- 
ning Publications Co., 2006. 

[22] Kashyap, V., C. Bussler, and M. Moran: The Semantic Web: Semantics 
for Data and Services on the Web. Springer, 2008. 





[23] King, R.: The University in the Global Age. Universities into the 21st 
Century. Palgrave Macmillan, 2004. 

[24] Lagoze, C., H. Van de Sompel, M. Nelson, and S. Warner: The Open
Archives Initiative Protocol for Metadata Harvesting. Open Archives
Initiative, December 2008.
http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm,
Protocol Version 2.0 of 2002-06-14.

[25] Manola, F. and E. Miller: RDF Primer - W3C Recommendation 10
February 2004. World Wide Web Consortium, February 2004.
http://www.w3.org/TR/rdf-primer.

[26] McGuinness, D.L. and F. van Harmelen: OWL Web Ontology Language
Overview - W3C Recommendation 10 February 2004. World Wide Web
Consortium, February 2004. http://www.w3.org/TR/owl-features.

[27] Metamatrix Development & Consulting AB: EMIL - Education
Information Markup Language, 2004.
http://www.elframework.org/projects/xcri/EMIL_PM%20v.1.0.pdf/download.

[28] Myllymaki, J.: Effective Web Data Extraction with Standard XML
Technologies. Computer Networks, 39:635-644, 2002.

[29] Prud’hommeaux, E. and A. Seaborne: SPARQL Query Language for 
RDF - W3C Recommendation 15 January 2008. World Wide Web 
Consortium, January 2008. http://www.w3.org/TR/rdf-sparql-query. 

[30] Pézeril, M.: Course Description Metadata (CDM): A Relevant and
Challenging Standard for Universities, 2006.
http://ariane.unil.ch/pdf/seminar/e-Quality_WS2_MPezeril_article.pdf.

[31] Ravaioli, S. and S. Velay: Supporting the Bologna Process with IT
Standards - New Emerging European Group for the Definition of Data
Exchange Standards between Higher Education Institutions. Kion,
Unisolution, November 2007.
http://www.moveonnet.eu/institutions/standards/downloads/rome20071109_finalreport.pdf,
Report from the European workshop, 9th November 2007 in Rome.

[32] Reese, T. and K. Banerjee: Building Digital Libraries. No. 153 in How- 
To-Do-It Manuals. Neal-Schuman Publishers Inc., 2008. 

[33] Richardson, L. and S. Ruby: RESTful Web Services. O’Reilly, 2007. 

[34] Stubbs, M.: XCRI (eXchanging Course-Related Information). Techn.
rep., Manchester Metropolitan University, Aytoun Street, Manchester,
M1 3GH, November 2006.

[35] Studer, R., S. Grimm, and A. Abecker: Semantic Web Services: Con- 
cepts, Technologies and Applications. Springer, 2007. 





[36] The Nielsen Company: Global Faces and Networked Places. Nielsen, 
March 2009. http://blog.nielsen.com/nielsenwire/wp-content/uploads/ 
2009/03/nielsen_globalfaces_mar09.pdf, A Nielsen Report on Social 
Networking’s New Global Footprint. 

[37] Valentine, A. and R. Skeel: The Business Case for the Development
of a PESC Standard for an XML Format for Course Catalogs. Letter
of Intent of Postsecondary Electronic Standard Council, April 2005.
http://www.pesc.org/library/docs/workgroups/course%20catalog%20workgroup/LetterOfIntent.pdf.

[38] Øverby, E., T. Hoel, and S. Wilson: The History of MLO-AD. Hypatia
AS, June 2008. http://www.slideshare.net/toreh/mlo-history-2008-06-24.

[39] Øverby, E., C.M. Stracke, A. Heath, P. Karlberg, J. Pawlowski, T. Hoel,
M. Collett, et al.: Metadata for Learning Opportunities (MLO) -
Advertising. ISO/IEC JTC1 SC36 - Information Technology for Learning,
Education, and Training, October 2008.
http://www.percolab.com/gtn/images/b/b5/MLO-AD.pdf.

[40] Weerawarana, S., F. Curbera, F. Leymann, T. Storey, and D.F. Fer- 
guson: Web Services Platform Architecture: SOAP, WSDL, WS-Policy, 
WS-Addressing, WS-BPEL, WS-Reliable Messaging, and More. Pren- 
tice Hall PTR, 2005. 

[41] Witten, I.H., M. Gori, and T. Numerico: Web Dragons: Inside the Myths 
of Search Engine Technology. Morgan Kaufmann Publishers, 2007. 

[42] Yee, R.: Pro Web 2.0 Mashups: Remixing Data and Web Services. The 
Expert’s Voice in Web Development. Apress, 2008. 







Publications du GTN-Québec

2012-03 Soutien au développement de ressources numériques pour l’enseignement
et l’apprentissage dans les universités québécoises - Rapport complet.
Rédigé par Line Cormier, Maureen Clapperton, Nicolas Gagnon, Michel
Gendron, Robert Gérin-Lajoie et Jean Marcoux, 71 p.

2012-02 Soutien au développement de ressources numériques pour l’enseignement
et l’apprentissage dans les universités québécoises - Les faits saillants.
Rédigé par Line Cormier, Maureen Clapperton, Nicolas Gagnon, Michel
Gendron, Robert Gérin-Lajoie et Jean Marcoux, 10 p.

2012-01 Manuels de cours numériques - droit d’auteur et gestion, inventaire des
solutions disponibles version 1.1. Rédigé par Réjean Payette, 38 p.

2011-06 Les tableaux numériques interactifs : considérations d’interopérabilité.
Rédigé par Marc-Antoine Parent, 28 p.

2011-05 Fédération d’identité pour les organismes de l’éducation. Rédigé par André
Breton, 50 p.

2011-04 Compte-rendu de participation, 26e colloque annuel CSUN 2011. Rédigé
par Denis Boudreau, 14 p.

2011-03 Les environnements d’apprentissage sont-ils en mutation ou en gestation?
Rédigé par Pierre-Julien Guay, Marcel Borduas, Yves Otis, Robet Paré et
Sacha Leprêtre, 21 p.

2011-02 Profil d’application québécois de métadonnées pour les opportunités d’étude,
d’apprentissage et de formation (v.0.7.5) Rédigé par Gilles Gauthier, 93 p.

2011-01 Profil d’application Normetic 2.0 (v0.7.5) Rédigé par Gilles Gauthier, 41 p.

2010-01 Évaluation de fonctionnalités de traitement des métadonnées par Alfesco en
comparaison avec Normetic. Rédigé par François Vincent, 9 p.

2009-06 Portrait des pratiques de sélection, de catalogage et de partage des
documents numériques dans les bibliothèques. Rédigé par Marie-Chantal
Dufour, 48 p.

2009-05 Accès aux contenus de formation en ligne : difficultés des apprenants
handicapés et solutions pour assurer l’accessibilité des contenus. Rédigé par
Denis Boudreau, 21 p.

2009-04 Développement MLO: Metadata for learning opportunities. Rédigé par Olivier
Gerbé et Thi-Lan-Anh Dinh, 32 p.

2009-03 Concept and Prototype of an Aggregator Portal for Learning Opportunities
Based on the MLO-AD Standard. Rédigé par Katharina Bauer-Oppinger, 89 p.


(other publications on the back cover)



GTN-Québec Publications (continued)


2009-02  Identification des caractéristiques des modèles de diffusion de contenus
         numériques : recension des dépôts numériques existants - Partie 2.
         Written by Gabriel Dumouchel and Thierry Karsenti, 99 p.

2009-01  Identification des caractéristiques des modèles de diffusion de contenus
         numériques : revue de littérature - Partie 1. Written by Gabriel
         Dumouchel and Thierry Karsenti, 54 p.

2008-05  Ressources d’apprentissage et normes : la situation au Québec. Written
         by Christian Lafrance, 102 p.

2008-04  Guide d’élaboration de fiches descriptives de ressources d’enseignement
         et d’apprentissage selon Normetic v1.2, profil d’application québécois
         du standard Learning Object Metadata (LOM). Written by Gérald
         Roberge, 57 p.

2008-03  Profil d’application Normetic 1.2. Written by Gérald Roberge, 170 p.

2008-02  Tableau du code XML à produire pour le vocabulaire de l’élément 5.2 de
         Normetic 1.2. Written by Gérald Roberge

2008-01  Tableau du code XML à produire pour le vocabulaire de l’élément 5.6 de
         Normetic 1.2. Written by Gérald Roberge

2007-01  Portrait général des stratégies d’assurance qualité des ressources
         d’enseignement et d’apprentissage (REA) : à l’attention des
         gestionnaires. Written by Karin Lundgren-Cayrol, Suzanne Lapointe and
         Ileana De la Teja, 25 p.

2006-03  Les normes, comment? Written by Gérald Roberge, 4 p.

2006-02  Les normes, pourquoi? Written by Gérald Roberge, 4 p.

2006-01  Guide pour la sélection de REA. Written by Gérald Roberge, 10 p.

2005-01  Le profil d’application Normetic, version 1.1. Written by Robert
         Thivierge, 8 p.

2003-01  La description normalisée des ressources : vers un patrimoine éducatif -
         Normetic, version 1.0. Under the supervision of CREPUQ and Novasys
         inc., 139 p.

2002-01  Les normes et standards de la formation en ligne - État des lieux et
         enjeux. Written by Rachel Chouinard. Under the supervision of CREPUQ
         and the SCTIC subcommittee, 39 p.


To download these publications or for the complete list of GTN-Québec publications,
see the website www.gtn-quebec.org/publications



